vouume 4. 


. 
/ 


WHAT’S HAPPENING IN THE 
MATHEMATICAL 


SCIENCES 
1998-1999 


\ ’ ds: rxY CIPRA 
cor 


WHAT’S HAPPENING IN THE 
MATHEMATICAL 


SCIENCES 


meee, 8 = | YY 9 9 


VoLuME 4. 


WHAT’S HAPPENING IN THE 
MATHEMATICAL 


SCIENCES 
1998-1999 


BARRY CIPRA 


Edited by 
Paul Zorn 


Advisory Board 
Noga Alon Robert Osserman 
Randolph E. Bank Carl Pomerance 
Herbert S. Wilf 


1991 Mathematics Subject Classification: 
00A06 


ISBN 0-8218-0766-8 
ISSN 1065-9358 


Copying and reprinting, Material in this 
book may be reproduced by any means 
for educational and scientific purposes 
without fee or permission with the excep- 
tion of reproduction by services that col- 
lect fees for delivery of documents and 
provided that the customary acknowledg- 
ment of the source is given. This consent 
does not extend to other kinds of copying 
for general distribution, for advertising or 
promotional purposes, or for resale. 
Requests for permission for commercial 
use of material should be addressed to the 
Assistant to the Publisher, American 
Mathematical Society, P. O. Box 6248, 
Providence, RI 02940-6248. Requests can 
also be made by e-mail to: 
reprint-permission@ams.org. 


©1999 by the American Mathematical Society. 
All rights reserved. 

The American Mathematical Society retains 
all rights except those granted by the 
United States Government. 

Printed in the United States of America. 
© The paper used in this book is acid-free 
and falls within the guidelines 
established to ensure permanence 
and durability. 

© Printed on recycled paper. 
10987654321 030201 0099 


This publication was created using the 
QuarkxPress* desktop publishing system 
on a Macintosh Quadra computer. 
Halftones were created from original pho- 


tographs with Adobe Photoshop, and illus- 
trations were redrawn using Adobe 


Illustrator on Power Macintosh comput- 
ers. 


Printed at Nittany Valley Offset, 
State College, PA. 


About the Author: Barry Cipra, who also did the writing for vol- 
umes 1, 2, and 3 of What's Happening in the Mathematical 
Sciences, is a freelance mathematics writer based in Northfield, 
Minnesota. He is currently a Contributing Correspondent for 
Science magazine and also writes regularly for SIAM News, the 
newsletter of the Society for Industrial and Applied Mathematics. 
He received the 1991 Merten M. Hasse Prize from the 
Mathematical Association of America for an expository article on 
the Ising model, published in the December 1987 issue of the 
American Mathematical Monthly. His book, Misteaks...and how to 
find them before the teacher does... (a calculus supplement), is pub- 
lished by Academic Press. 


About the Editor: Paul Zorn, who also did the editing for vol- 
umes 2 and 3 of What's Happening in the Mathematical Sciences, 
is Professor of Mathematics at St. Olaf College in Northfield, 
Minnesota. He received the 1987 Carl B. Allendoerfer Award from 
the Mathematical Association of America for an expository article 
on the Bieberbach conjecture. He is currently editor of 
Mathematics Magazine. 


Cover: Detail from “Wave-Particle Duality” by Mel Fisher (see 
page 86). Fisher, a Philadelphia-based artist, creates elaborate 
sculptures by cutting sinusoidal patterns into strips of wood or, in 
this case, white plastic and then gluing the pieces together. Colored 
lights add lustre to the piece, which is Fisher’s artistic interpretation 
of one of the fundamental principles—and paradoxes—of quantum 
mechanics, the fact that light is both a wave and a particle. (Photo 
courtesy of Mel Fisher.) 


The AMS gratefully acknowledges the support of the Alfred P. 


Sloan Foundation for the publication and distribution of What's 
Happening in the Mathematical Sciences. 


Contents 


A Blue-Letter Day 3 

for Computer Chess 

Deep Blue’s victory over Garry Kasparov is the end of 

a long road in computer chess, but the mathematical 
_ study of “perfect” play in combinatorial games has an 

even longer ways to go. 


A Prime Case of Chaos 21 


Quantum chaologists and analytic number theorists 
have their sights set on a mysterious mathematical 
object: the Riemann zeta function. 


Proof by Example: A 37, 
Mathematician’s Mathematician 


Paul Erdos, the “Johnny Appleseed” of mathematics, 
inspired hundreds of colleagues and left a rich legacy of 
mathematical problems—and solutions. 


Computers Take Algebraic 47 
Geometry Back to Its Roots 


Computer power has brought algorithmic questions in 
algebraic geometry back to the fore. 


As Easy As EQP 59 


An automated theorem prover succeeds in settling a 
decades-old conjecture in symbolic logic. What's left 
for human minds to do? 


Beetlemania: Chaos in Ecology 73 


A collaboration between mathematicians and biologists 
has led to the first experimental evidence for chaotic 
dynamics in a population. 


From Wired to Weird 83 


Mathematical discoveries are shaping research in a 
potentially revolutionary new kind of computing, based 
on principles of quantum mechanics. 


Tales from the Cryptosystem 97 


A breakthrough in the theory of computational com- 
plexity has implications for cryptographic systems with 
“guaranteed” security. 


But Is It Math? 103 


Mathematics and art have more in common than is com- 
monly supposed. Two twentieth-century artists, M.C. 
Escher and Marcel Duchamp, used mathematics as an 
inspiration for works of art. 


Mathematical Discovery 115 
(by Henri Poincaré) 
Henri Poincaré’s thoughts on thinking, written near the 


beginning of the twentieth century, are well worth 
repeating at century’s end. 


Introduction 


o—or what—is the world’s best chess player? Is quantum 
chaos lurking within the theory of numbers? Will future 
computers cash in on the weirdness of quantum mechanics? Can 
populations behave chaotically? Did a mathematical essay by 
Henri Poincaré inspire the revolutionary development of modern 
art? These are among the fascinating questions addressed in the 
present volume of What's Happening in the Mathematical 
Sciences. 

Even though these articles describe many cutting-edge areas of 
modern pure and applied mathematical research, they can be read 
and appreciated by anyone with an interest in mathematics. Their 
author, Barry Cipra, is a well-known freelance mathematics writer 
who is a Contributing Correspondent for Science magazine and 
also writes for SJAM News. In reading these articles you should 
also get a feeling for the beauty and universality of mathematics. 

The American Mathematical Society hopes that you will find 
this book both enjoyable and intellectually rewarding. 


Donald Babbitt 
Publisher of the AMS 


Deep Blue. Js the chess-playing computer contemplating its next move—or is it thinking about something else? 
(Photograph courtesy of IBM.) 
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A Blue-Letter Day for 


Computer Chess 


t was bound to happen sooner or later. By some estimates it hap- 

pened sooner. By others, later. But on May 11, 1997, it hap- 
pened: A computer program wrested from human hands the title of 
world’s best chess player. 

A machine named Deep Blue, designed by a team of researchers 
at IBM’s T. J. Watson Research Center in Yorktown Heights, New 
York, beat Garry Kasparov, the reigning champion of the 
International Chess Federation since 1985, in a 6-game match. 
Deep Blue's victory capped four decades’ progress in computer 
chess, a spin-off of the larger effort in computer science known as 
artificial intelligence. The win represents a milestone in an essen- 
tially mathematical problem: How do you organize the search for 
a “best” solution when the number of possibilities is beyond com- 
prehension? But it leaves open another mathematical problem, one 
that’s best phrased in theological terms: How does God play chess? 

The concept of artificial intelligence—designing machines that 
“think” like human beings—has been around for ages. In the sev- 
enteenth century, Gottfried Leibniz (Isaac Newton's rival in the dis- 
covery of calculus) dreamed of settling philosophical disputes by 
means of a logical calculator. Others have argued for the potential 
of “creative engines.” But it was only the advent of high-speed 
computers in the 1940's and 50's that gave researchers confidence 
that artificial intelligence, Al for short, was within reach. The pro- 
cessing power of electronic machines, they predicted, would soon 
make AI a practical—perhaps even a threatening—reality. 

Things haven't worked out that way. Artificial intelligence has 
proved to be a lot more elusive than AI enthusiasts first projected. 
To be sure, there’s been a lot of progress. But much of it has come 
in specific, limited applications, such as computer vision for quali- 
ty control in manufacturing plants—‘expert systems” that focus 
their reasoning powers on a single subject. The goal of imitating 
the nimble wit of the human mind remains unrealized. 

Computer chess has been both the measure of Al’s frustration 
and, oddly enough, a signpost of AI’s success. Early AI pioneers 
such as Herbert Simon, a Nobel prize-winning economist at 
Carnegie—Mellon University, predicted in the 1950’s that computer 
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programs would defeat the best chess 
players in the world within a decade. 

When that didn’t happen, the pendulum 

of opinion swung the other way. Some 

computer scientists speculated that com- 
puters would never play chess as well as 
people. Many said that AI was a waste of 
time. But a few kept plugging—and pro- 
gramming—away. 
Why would anyone expect computers to excel 
at chess? And why was Deep Blue’s victory so long 
in coming? 
In principle, machines should excel at chess, because 
much of the game depends on carefully examining the con- 
sequences of particular moves. A single oversight can be fatal, 
as novice players learn over and over. But a properly programmed 
computer doesn’t make that kind of mistake. Untroubled by bore- 
dom, a computer can accurately envision the board a dozen or more 
moves ahead, without worrying that 
something important has been over- 
looked. (Of course computers don’t 
worry anyway—yet.) If the secret of 
chess is in the ability to look ahead, com- 
puters should do well indeed. 

But there seems to be more to chess than 
that. It’s not enough just to see what the 
board might look like, it’s also necessary to 

evaluate and compare the possible positions. 
And now, things get tricky. 
Over the years, chess players have developed var- 
ious heuristics for evaluating chess positions. One sim- 
ple approach is to assign a value to each piece—pawns are 
commonly worth | point, knights and bishops 3, rooks 5, and 
the queen 9 (the king, of course, is beyond value)—and then take 
the difference between your pieces’ total value and that of your 


Figure 1. A simple 
3-move game tree with 


“pay-offs” at its leaves (top), opponent's pieces, with additional fractions of a point added or 
backtracked to a value at the root subtracted for pieces that are threatened or pieces that are protect- 
by the min-max algorithm (bottom). ns 


Tf such an “evaluation function” were an accurate gauge of who 
holds the advantage, chess would be a dull, mechanical exercise: 
Each player would simply consider all his or her possible moves, 
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evaluate each possible resulting position, and then make the move 
with the highest “score.” Actually the game would be duller yet: 
Any player with a disadvantage would immediately resign. The 
game would be interesting only as long as neither player held an 
advantage. 

Fortunately, no one has come up with an absolutely accurate 
evaluation function—and it’s unlikely anyone ever will. So players 
(of the silicon species at least) try to parlay imperfect evaluations 
into better approximations by combining them with what's known 
as game-tree analysis. 

A game tree is an abstract representation of a set of possible 
sequences of moves by the two players, Black and White. For sim- 
plicity, suppose each player has just two alternatives for each move, 
and we look only three moves ahead (that is, White’s first move, 
Black's response, and White’s counter-response). The game tree 
can be drawn as in Figure | (top). The positions White moves from 
are shown as circles, while those for Black are squares. Finally, 
we’ve made up values that White might assign to the positions at 
the end of three moves. 

The essence of game-tree analysis is to backtrack from the val- 
ues at the “leaves” of the tree to a value at its “root.” This is done 
by assigning to each branch point either the maximum or the min- 
imum of the values immediately above it. Since the circular branch 
points in Figure 1 (top) correspond to positions from which White 
moves, these points take the maximum of the values above them— 
in essence, White rejects all but his or her best options. Black, on 
the other hand, wants to steer things toward White’s worst option, 
so the square branch points are filled with the minimum of the (cir- 
cular) branch points above them. The appropriate max’s and min’s 
can be computed and filled in, working from top to bottom of the 
game tree as shown in Figure | (bottom). 

In theory, a game tree can be extended to the very end of play, 
at which point only three values are required: 1, —1, and 0, repre- 
senting Win, Lose, and Draw. But to actually draw such a tree for 
a game like chess is physically impossible: The number of possi- 
ble positions on a chessboard outstrips the number of elementary 
particles in the universe. Even so trivial a game as tic-tac-toe has a 
healthy-looking shrub of a game tree. If God plays chess by con- 
sulting a full-fledged game tree, She must have resources we 
haven’t dreamed of. 

But it’s not strictly necessary to see the entire game tree to 
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evaluate lines of play. In what's known as depth-first 
Re search, the analysis systematically explores one line of 
play after another, assigns tentative values to branch 
points, and ultimately finds the correct value (and open- 


C ing move) for the root of the tree (see Figure 2). At 
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each stage, only one line of play is actually drawn (i.e., 
kept in computer memory). This solves the physical 
problem of trying to draw a zillion branches. 
Unfortunately it doesn’t reduce the amount of time 
required, because you still have to examine the entire 
tree. 

The latter problem can be ameliorated by judicious 
“pruning” of the game tree. In Figure 2, for example, 
it’s unnecessary to worry about White's right-hand 
option for the first move beyond the line of play exam- 
ined in step 6, because the result there implies that the 
“return” will be less than what's guaranteed by the left- 
hand option. This elementary deduction cuts the search 
by 25 percent. Of course pruning entails “costs” of its 
own, which, for a game tree this small, are hardly worth 
paying. But in realistically large settings, this kind of 
pruning is invaluable. It’s what people do, for example, 
when they analyze tic-tac-toe, and it’s part of the 
“search engine” that helped Deep Blue beat Kasparov. 


Figure 2. Depth-first search of a game tree minimizes Roughly speaking, pruning works best when the 


the amount of information that must be retained at any 
stage of analysis, but doesn't in itself shorten the 


search happens to look at “good” lines of play first. 


amount of time required for the search. Judicious Ideally, the search would start with what turns out to be 


pruning can help. 
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the best line of play and then start ruling out alterna- 

tives. But obviously this can’t be guaranteed—other- 
wise, there'd be no need to search at all! Much of the “art” in AI 
lies in designing heuristic rules to organize efficient searches by 
anticipating the outcome. 

Even with sharp pruning shears—or for that matter, a chain- 
saw—computers are still not powerful enough to do a complete 
search of the game tree for chess. They probably never will be— 
there are just too many branches. Instead, chess-playing programs 
like Deep Blue look only a certain distance along each line of play, 
use an evaluation function to assign comparative values to the 
“final” positions, and then backtrack these values to the current 
position (pruning along the way). The idea is that even an imper- 
fect evaluation function can distinguish good lines of play from 


bad; at the very least, it will not recommend a move that leads to 
immediate disaster. 

Experience has shown this approach to work. Researchers in 
computer chess have found that increasing the level to which their 
programs search the game tree is the single most effective thing 
they can do—something like a football coach’s realization that the 
easiest way to win is to have big, strong players. Not that that’s 
enough, either in chess or in football. But having a chess program 
with a lot of depth play one that looks only a few moves ahead is 
like pitting the Green Bay Packers against a high-school JV squad. 

In the 1970's, Ken Thompson, a computer scientist at AT&T 
Bell Laboratories, found that each additional level of search—a 
level, or “ply,” being a move by one player—is worth approxi- 
mately 200 points in the rating system used by the Fédération 
Internationale des Echecs (FIDE) (see Figure 3). The FIDE system 
uses a complicated (and somewhat arbitrary) formula to convert a 
player’s record in competition into a numerical value. Kasparov 
currently (as of 1998) ranks at the all-time high of 2806 points, with 
Vladimir Kramnik of Russia in second at 2790 Anand Viswanathan 
of India trailing at 2770, (Bobby Fischer disappeared with a rating 
of 2594.) Extrapolating Thompson’s data, a computer should out- 
rank Kasparov at around 12 or 13 levels of search. 

Deep Blue can go 14 levels deep. 

The depth to which a computer program can afford to search 


4 5 6 is 8 9 Rating 
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Figure 3. Deeper searches result in better play. In the 1970's, Ken 
Thompson pitted versions of the chess program Belle against one another 
in 20-game matches. The versions ranged from 4- to 9-ply searches. The 
tournament results show, for example, Belle-9 beating Belle-8 14.5 games 
out of 20 (draws count as half victories for each player). As expected, 
Belle-9 came out on top, with an estimated FIDE rating of 2328. 
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Depth-first search and pruning of the game tree are just 
the starting point for Deep Blue’s chess strategy. Other tech- 
niques include iterative deepening, selective extension, and 
transposition tables. Each has applications in other comput- 
er search programs. 

In iterative deepening, the computer first finds the “best” 
move suggested by a search at, say 10 levels deep, then 
recomputes the search at levels 11, 12, and so on. Given the 
exponential growth of the game tree, the repetitive nature of 
the shallow searches is only a minor penalty, offset by the 
advantage that if for some reason the machine runs out of 
time during, say the depth-14 search, it still has a decent, 
depth-13 move to play. Iterative deepening can also suggest 
the order in which the game tree should be searched at suc- 
cessive levels, increasing the pruning efficiency. 

Selective extension is a technique whereby the computer 
sacrifices the full depth to which it is capable of searching in 
order to concentrate on the most promising, or problematic, 
lines of play. In other words, instead of exhaustively search- 


ing to depth 14, Deep Blue may follow most lines of play 
only to depth, say 12, but follow some lines much further. 
The positions singled out for closer scrutiny are those that 
seem far superior to their alternatives. In essence, selective 
extension acknowledges the fact that the evaluation function 
is an imperfect indicator of a position’s strength or 


Figure 4. White to move and win, by capturing one of Black's knights— 
on the 243rd move! An endgame-analyzing program has shown that, bar- 
ring a mistake by Black, White cannot win any more quickly than that. 
(From “Multilinear Algebra and Chess Endgames,” Lewis Stiller. In 
Games of No Chance, Richard J. Nowakowski, ed.) 
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weakness, and can sometimes be fooled. 

A transposition table is a collection of positions the com- 
puter generates during a search. The idea is that many posi- 
tions occur repeatedly in the game tree, because they result 
from the same pieces being moved in different orders. Once 
a position has been evaluated, there is no need to redo the 
computation until it’s time to deepen the search, 

Transposition tables are, in fact, the key to analyzing 
endgames, when only a few pieces remain. If, for example, 
one player still has a queen while the other is down to a rook, 
the computer can literally generate every possible position. 
The analysis becomes a (nontrivial) exercise in backtracking 
from the checkmate positions to those that give checkmate in 
one move, then those that give it in two moves, three moves, 
and so on. 

In general, endgame analysis gets harder as more pieces 
are included. Researchers are slowly building a database of 
all endgames with a handful of pieces. Most recently, using 
techniques from multilinear algebra, Lewis Stiller, a com- 
puter scientist at Johns Hopkins University, compete 
database of 6-piece (pawnless) endgames—well beyond 
what had been thought feasible. Stiller's analysis includes 
the longest endgame on record: a battle that requires 243 
moves for White to position itself for a win by capturing one 
of Black's pieces (see Figure 4). It appears in a 1996 book 
on combinatorial games edited by Richard Nowakowski at 
Dalhousie University in Halifax, Nova Scotia, and titled, 
appropriately, Games of No Chance. 


remains largely a matter of speed. Even with extensive pruning, the 
number of lines of play that need to be examined grows exponen- 
tially with the number of levels. To climb to the fourteenth level, 
the computer needs to search through upwards of 60 billion lines of 
play. The rules of chess limit the amount of time available to each 
player: On average, each move should be decided within five min- 
utes. That means a computer must be able to evaluate around 200 
million positions per second. 

Deep Blue can do that. The machine that beat Kasparov is an 
IBM RS/6000 SP, with 256 processors working in parallel. IBM 
sells these machines mainly to data-intensive industries in such 
fields as petroleum engineering and financial forecasting. The 
“official” reason, in fact, for developing Deep Blue is to advance 
computer search techniques, which are vital to a spectrum of appli- 
cations. The computer science gains would have remained even if 
Kasparov had beaten the machine. 

But victory was still sweet for the Deep Blue team. The new 
champ had been a dozen years in the making. The story starts in 
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1985, when Feng-hsiung Hsu, then a graduate student at 
Carnegie—Mellon University, developed a single-chip circuit for 
generating chess moves. Thomas Anantharaman, then also a grad- 
uate student at CMU, supplied a simple chess-playing program. 
They named the resulting machine Chiptest. 

Murray Campbell, an expert-level chess player, joined the team 
in 1986. Campbell prepared an “opening book” (a collection of 
lines of play at the beginning of the game that every serious chess 
player knows by heart—or circuit) for Chiptest and made changes 
in the program’s evaluation function. At this point the machine was 
searching approximately 50,000 positions per second. 

Chiptest’s first foray into competitive chess, in November, 1986, 
was not impressive. The Carnegie~Mellon team entered the North 
American Computer Chess Championship, sponsored by the 
Association for Computing Machinery (ACM). Their machine 
won two games, drew one, and lost two. But in 1987, an improved 
Chiptest, renamed Chiptest-M (the “M” because the search engine 
had been rewritten in microcode, which sped up the search to 
around 500,000 positions per second), took the ACM title with four 
wins and no draws or losses. 

In 1989, Chiptest’s successor, Deep Thought (named after the 
ponderous computer in Douglas Adams’s The Hitchhiker's Guide 
to the Galaxy), now searching more than 700,000 positions per sec- 
ond on two custom-built processors and with an opening book of 
about 5000 positions, won the world computer chess championship 
with a 5-0 score. Later that year, Hsu and Campbell joined IBM 
and moved Deep Thought to the Thomas J. Watson Research 
Center in Yorktown Heights, New York. (Anantharaman joined 
them in 1990, but left IBM for another job later that year.) 

That October, Deep Thought had its first encounter with the ulti- 
mate chess-playing machine: Garry Kasparov. 

Unlike many high-level chess players, Kasparov welcomed the 
challenge of man—machine competition. (It’s been said of Kasparov 
that he would pit his skills against those an alien from outer space.) 
He agreed to a two-game match with Deep Thought. 

Kasporov won both games easily. Commenting on the first 
drubbing, he remarked that “after such a loss, no human would 
come back for more.” 

But the computer wasn’t programmed to feel humiliated. In 
1990, Joe Hoane and Jerry Brody joined Hsu and Campbell. A year 
later, Deep Thought, now running on an IBM RS/6000 with 24 cus- 


tom-built chess processors, searching more than six million posi- 

tions per second, won all five of its games on the way to its second 

straight ACM International Computer Chess Championship title. 
Chung-Jen Tan became project supervisor in 1992. In 1993, 


Deep Thought was renamed Deep Blue, in homage to IBM's cor- 
porate color. Two years later, Kasparov agreed to a six-game match 
with the IBM machine, to be played in Philadelphia in February, 
1996, in conjunction with a meeting of the ACM celebrating the 
fiftieth “birthday” of ENIAC, the original electronic computer, built 
at the University of Pennsylvania, There was a tidy purse at stake: 
$400,000 to the winner and $100,000 to the loser. 

The machine that faced Kasparov in 1996 had 192 chess proces- 
sors, capable of evaluating 100 million positions per second. The 
Deep Blue team also now boasted its own international grandmas- 
ter, Joel Benjamin, who worked on Deep Blue’s opening book. 
Even so, Kasparov was confident of victory. 

He was in for a surprise: Deep Blue beat him in the first game. 
On its 29th move, the machine surprised the chess experts observ- 
ing the game, ostensibly weakening its position in order to capture 
one of Kasparov’s pawns. But Deep Blue had calculated correctly. 
Kasparov was forced to resign on his 37th move. 

It was the champion’s first loss to a computer in a regulation 
game. (He had lost a game of speed chess to a Pentium PC in 
1994). But Kasparov regrouped, winning three of the last five 
games and drawing the other two. 


The Deep Blue Team. Sitting left to 
right, Joel Benjamin, Chung-Jen Tan. 
Standing left to right, Jerry Brody, 
Murray Campbell, Feng-hsiung Hsu, 
and Joe Hoane. (Photograph courtesy 
of IBM.) 
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The 1997 rematch, with a $1.1 million purse ($700,000 to the 
winner, $400,000 to the loser), pitted Kasparov against a still more 
powerful machine, with an expanded opening book, a refined eval- 
uation function, and other improvements. This time, however, it 
was Kasparov who shocked the machine, with a win in the first 
game. 

The computer came back to take game two, and then drew the 
next three games. In game six, Kasparov made an uncharacteristic 
blunder on his seventh move, and angrily resigned on his 19th 
move. Deep Blue had won the match. 

Does that mean Deep Blue is the better player? Kasparov cer- 
tainly doesn’t think so. In a press conference right after the match, 
he tore into the circumstances of the match, which he said favored 
the machine. “If Deep Blue will start playing competitive chess, I 
personally guarantee you I'll tear it to pieces,” he said. “I think 
eventually machines will prevail, but I don’t think that you can take 
today as the day of doom.” 

Many of Kasparov's fans agree with their human hero, noting a 
crucial asymmetry between the competitors: Deep Blue was able 
to prepare for its opponent by studying Kasparov's previous games, 
but the champion had no access to games that the machine, in its 
current design, had played. A rematch might settle the issue, but 
that looks unlikely: The machine that beat Kasparov has been dis- 
mantled. Publicity-wise, IBM has little to gain from a rematch, so 
the question of who’s the better player will most likely remain the 
subject of chessboard debate. 

But from a purely mathematical standpoint, the question is irrel- 
evant. Neither Deep Blue nor Kasparov has a purchase on “per- 
fect” play. Even if (or when) a computer becomes unbeatable, it’s 
unlikely there will be a proof that it always makes the best move. 
The complicated rules of chess seem to preclude the kind of analy- 
sis mathematicians use to “short-circuit” the game tree. 

Even for much simpler games than chess, combinatorial game 
theorists say there are plenty of challenging problems. Take 
“Grundy’s Game,” for example. Two players take turns splitting 
piles of coins into unequal piles, the game ending when every pile 
has just one or two coins (the winner is the last player to make a 
move). Researchers have analyzed Grundy’s Game with up to a 
billion coins, but without finding a provable pattern. 

For Richard Guy, a mathematician at the University of Calgary, 
in Alberta, Canada, that’s part of the appeal of combinatorial 


games: “One of the things that intrigues me most is that there are 
so many places where things seem to become almost completely 
chaotic.” 

What makes Grundy’s Game, and others like it, analyzable at 
all? After all, by the time there’s a billion coins, the game tree for 
Grundy’s Game makes the tree for chess look like a mere twig. The 
difference is that Grundy’s Game has a very simple structure. Once 
a pile has been split, the resulting two piles have no further inter- 
action. This means the game decomposes into a “sum” of games. 
And that makes it subject to some simple, yet powerful, algebraic 
tools. 

Much of the theory was worked out in the 1930’s by Roland 
Sprague and Patrick Grundy, who studied what are called “impar- 
tial” games. These are games in which the same options are open 
to either player throughout the game—unlike chess, where if 
you’re playing White, you're not allowed to move the Black pieces. 
Chess is an example of what combinatorial game theorists call 
“partizan” games. 

The Sprague-Grundy theory enabled Guy to analyze Grundy’s 
Game by hand into the thousands in the 1950's. In 1973, Elwyn 
Berlekamp, a computer scientist at the University of California at 
Berkeley, extended Guy’s analysis (by machine) into the hundreds 
of thousands. Berlekamp’s results revealed properties of Grundy’s 
Game—and others like it—that have enabled combinatorial game 
theorists to streamline the analysis so that it can be extended com- 
putationally even further. 

The theory of combinatorial games took off in the 1970's, with 
the discovery by John H. Conway, then at Cambridge University, in 
England, of a deep connection between numbers and games. 
Conway, who is now at Princeton University, developed a theory of 
partizan games. In the process he showed that, surprisingly, games 
are more fundamental than numbers. In particular, the familiar sys- 
tem of “real” numbers can be viewed as a collection of games— 
rather simple ones, at that. Conway’s theory embeds the real num- 
bers in a larger system, which Donald Knuth, a computer scientist 
at Stanford University, dubbed the “surreal” numbers. 

The surreals include both infinitely large and infinitesimally 
small numbers. (Newton and Leibniz would have found the 
surreals extremely useful.) Martin Kruskal, a mathematician at 
Rutgers University, has embarked on a project to develop a “surre- 
al” theory of calculus. He thinks the theory may pay off in areas 
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billion coins, but 
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Figure 5. Black wins this checkers 
endgame, regardless of who has the 
next move. 
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such as mathematical physics, where “ordinary” calculus gets stuck 
on certain formulas which yield numbers that go off to infinity. 
Because the surreal system deals with infinitely large quantities on 
an equal basis with finite quantities, the limitations of calculus may 
be overcome in the surreal setting, Kruskal says. 

But numbers—even the surreals—occupy just a tiny corner of 
the universe of games, which includes non-numerical games with 
fanciful names like “Star” (*), “Up” ({), and “Tiny-two” (+2). 
Star, for example, is a game in which only one move can possibly 
be made, so that the first player can’t help but win—as in Grundy’s 
Game played with only three coins. In Up, on the other hand, one 
player, say Leo, always wins on his first turn, regardless of who 
plays first. Such a situation can occur in checkers, where the objec- 
tive is to eliminate your opponent's pieces and one of the rules is 
that jumps are forced (see Figure 5). 

Two games can be “added” together by letting each player, on 
his or her turn, make a move in one of the two games. The game 
*+ |, for example, is another first-player win: If Leo goes first, he 
makes the only move in *, forcing Rachel to make her only move 
in |; if Rachel goes first, though, she can win by making her open- 
ing move in |—this guarantees her the final move (and, by defini- 
tion, victory) in the sum, regardless of which move Leo makes. 

Noam Elkies, a mathematician at Harvard University, has used 
these elementary games to analyze some chess endgames. In one 
example (see Figure 6), the first player to move his king is easily 
beaten, so the game essentially boils down to playing for the last 
pawn move on the two edges of the board. (This position actually 
occurred in tournament play, in 1929, with White making the win- 
ning move.) If Leo is playing Black, Elkies has shown, the right 
side turns out to be equivalent to | + 7 +», The left side, although 
it looks much more complicated, is just |, the opposite of “Up.” 
Since “Up” and “Down” cancel each other out, the total game is 
just | +*. This means that whoever gets to make the first move 
can win, 

Berlekamp has gone even further with the Japanese game Go. 
Go is played with black and white “stones” on a 19 x 19 grid; the 
objective is to surround portions of the board. Berlekamp has used 
the theory of combinatorial games to create positions in Go that are 
so subtle they have stumped some of the game's best players (see 
Box, “A Cool Game of Go”). 


Combinatorial game theorists have invented hundreds of new 
games, some of which the theory can analyze completely. But 
many, like Grundy’s Game, remain tantalizingly out of reach. But 
the subject is just getting started. What line of play will the theory 
itself pursue? No one knows, says Guy: “It’s so vast a subject that 
it’s impossible to point in the directions that it'll go when it does 
take off.” 


Figure 6. /t doesn't matter whether 
you're playing Black or White, you can 
win this endgame as long as the next 
move is yours, (From “On Numbers 
and Endgames: Combinatorial Game 
Theory in Chess Endgames,” Noam D. 
Elkies. In Games of No Chance, 
Richard J. Nowakowski, ed.) 
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Figure 7. Where should White put the next stone? Even professional Go players 
have a hard time finding the right move! (Reprinted with permission from 
Mathematical Go: Chilling Gets the Last Point, Elwyn Berlekamp and David 
Wolfe, A. K. Peters, Ltd., Natick, MA.) 
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Go players rely more 
on rules of thumb than 
exact analysis. 
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Elwyn Berlekamp. (Photograph by Eliot Khuner, www. khuner.com. ) 


Figure 8. The opening move in 7-coin Split moves 4 coins to the right and 3 to the left. After that the game gets inter- 
esting. 
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Figure 9. The game tree for 6-coin Split shows it’s a win for the second player, but only if he makes the right moves 
(purple branch lines). 
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No More Split Ends? 


The game of Split is played with piles of coins on an end- 
less row of squares. Players take turns picking a square with 
at least two coins and splitting them equally between the two 
adjacent squares, moving the extra coin to the right if there 
is an odd number (see Figure 8, page 17). The game ends 
when no square has two or more coins. To win is to make 
the last move. 

Figure 9 shows the game tree for Split when the game 
starts with six coins on one square. The second player has a 
winning strategy, indicated by the purple branches, 
Technically speaking, the figure is not a tree because several 
branches have “rejoined,” but this is just a convenience to 
simplify the picture. The simplification is closely related to 
the way transposition tables speed up the search in comput- 
er chess: If the same position results from two moves made 
in either order, there is no need to repeat the subsequent 
search. 

The reader is invited to analyze 7-coin Split, or a variant, 
suggested by Conway, in which the rule is to move any equal 
number of coins left and right from any square with two or 
more coins. (For example, starting with 6 coins on one 
square, there are three possible opening moves, shown in 
Figure 10.) A curious feature is that the winner is deter- 
mined roughly halfway through the game, regardless of how 
he or she plays from then on—a potentially appealing alter- 
native to chess, where careless play in the endgame can be 
fatal. 


KI & 
EO a KIT 
> 27° mon 


Figure 10. Conway's variant of Split has multiple options from the very beginning. Who wins when you start with 6 
coins? Who wins with 666 coins? 
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“Random” Gaps. The statistics of nearest-neighbor spacings range from random to uniform (<'s indicate spacings 
too close for the figure to resolve). The second column shows the primes from 7,791,097 to 7,791,877. The third col- 
umn shows energy levels for an excited heavy (Erbium) nucleus. The fourth column is a “length spectrum” of period- 
ic trajectories for Sinai billiards. The fifth column is a spectrum of zeroes of the Riemann zeta function. (Figure cour- 
tesy of Springer-Verlag New York, Inc., “Chaotic motion and random matrix theories” by O. Bohigas and M. J. 
Giannoni in Mathematical and Computational Methods in Nuclear Physics, J. M. Gomez et al., eds., Lecture Notes in 
Physics, volume 209 (1984), pp. 1-99.) 
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A Prime Case of Chaos 


s one of the deepest problems in number theory tied to one of the 

most difficult subjects in modern physics? More and more 
researchers from both disciplines think—indeed, hope—so. The 
possibility has led to a surprising collaboration of physicists pursu- 
ing the implications of quantum chaos and mathematicians hunting 
for a proof of a famous conjecture in number theory known as the 
Riemann Hypothesis. 

Quantum chaos is a relative newcomer in theoretical physics. 
“It’s a problem of extreme mathematical subtlety and complexity,” 
says Michael Berry, a “quantum chaologist” at the University of 
Bristol. “There’s a lot that's understood, and a lot that’s not under- 
stood. One particular thing that’s lacking is a good model which 
has the essence of the general behavior.” 

As its name implies, quantum chaos combines two revolution- 
ary developments in modern physics: quantum mechanics and 
chaos theory. Quantum mechanics is an esoteric theory of particles 
and waves, developed by physicists in the 1920's to account for a 
body of mysterious phenomena that had turned up when experi- 
mentalists started exploring the tiny world of atoms. It is current- 
ly viewed as the “true” description of nature, usurping the position 
held for 250 years by Isaac Newton’s “classical” mechanics. Its 
rules are very precise and its results have so far proved highly accu- 
rate, but much of quantum mechanics still seems counterintuitive, 
even to physicists steeped in the subject. 

Chaos, on the other hand, is a familiar phenomenon. Its rise to 
prominence in the 1980's stemmed from the realization that disor- 
derly systems tend to be disorderly in an orderly fashion—and that 
the underlying order within chaotic phenomena can be revealed by 
a combination of mathematical analysis and computer simulations. 
Mathematically, chaos is caused by “nonlinearities” in the equa- 
tions that describe a dynamical system. “Nonlinearity” means that 
changes in output are not necessarily proportional to changes in 
input (see Figure 1). Not every nonlinear system is chaotic, of 
course; if that were the case, science would have essentially no pre- 
dictive power. But where chaos does occur (or seems to occur)— 
in long-range weather patterns, for example—it puts a limit on 
what can be predicted, and consequently calls into question the 
usefulness of the equations. 


input 


Figure 1. Jn a nonlinear system, 
change in output is not in proportion to 
change in input. 
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Quantum chaos is 
seemingly a 
contradiction in terms. 
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Classical mechanics abounds in nonlinear systems that exhibit 
chaos. The “double pendulum” is one such system. The classical 
picture of electrons orbiting an atomic nucleus is another. In com- 
puter simulations, the equations of these systems produce solutions 
that seem to wander aimlessly about, even though everything about 
the problem is completely deterministic. 

You might expect that the weirdness of quantum mechanics 
would only compound the chaotic aspects of classically chaotic 
systems. But that’s not what happens—what happens is even more 
bizarre. 

In a nutshell, quantum mechanics forbids chaos. Quantum 
chaos is seemingly a contradiction in terms. 

That's because quantum mechanics is inherently linear, At the 
heart of quantum mechanics is the notion that everything in nature 
is a sum, or superposition, of certain “base states.” The base states 
(also called eigenstates) behave in very simple ways, and the 
behavior of the system is just the sum of these simple components. 
The only things that are subject to change are the relative propor- 
tions of the various eigenstates in the superposition. These changes 
are made by a mathematical object called an operator—so called 
because it acts, or operates, on the states of the system. (When 
there are only finitely many base states, operators are also called 
matrices.) The whole theory is set up so that changes in output are 
proportional to changes in input. There’s no nonlinear crack 
through which chaos can slip in. 

Or so it would seem. 

But in fact quantum chaos does exist. It’s just a little different 
from what it first sounds like. 

An essential feature of any quantum system is a set of numbers 
called eigenvalues, which characterize the simple behavior of the 
eigenstates. For example, in the fundamental equation of quantum 
mechanics known as the Schrédinger equation, the eigenvalues are 
positive real numbers that correspond to energy levels. The eigen- 
state with smallest eigenvalue is called the ground state, the one 
with next smallest eigenvalue is called the first excited state, and so 
forth. Mathematically, there are infinitely many such states, each 
with its own eigenvalue. The collection of eigenvalues is called the 
spectrum of the system. 

For some systems—say an electron trapped inside a rectangular 
box—the spectrum is easy to calculate. But these are systems 
whose classical analogues are non-chaotic. The electron in a box 


Trace Elements 


Physicist Martin Gutzwiller at IBM’s T. J. Watson 
Research Center didn’t know he was doing number theory in 
the 1960's. But neither did Atle Selberg know he was setting 
the stage for quantum chaos in the 1950's. 

Selberg, a native of Norway now at the Institute for 


Advanced Study in Princeton, New Jersey, was studying 
number-theoretic implications of the analytic structure of 
certain curved spaces. (Curiously, Selberg’s most famous 
result is an “elementary” proof of the Prime Number 
Theorem—a proof that does not use any analytic properties 
of the Riemann zeta function. He and the Hungarian math- 
ematician Paul Erdés found the first elementary proofs in 
1948.) Selberg derived an equation that had the eigenvalues 
of a differential operator on one side and the lengths of 
closed curves in the space on the other side. This equation, 
he showed, “encodes” the number-theoretic properties that 
underlie the structure of the curved space. 

Selberg’s equation has come to be known as the Selberg 
Trace Formula. Number theorists have written hundreds of 
papers and devoted dozens of conferences to understanding 
and generalizing it. For Gutzwiller, it represents a Rosetta 
stone connecting quantum chaos with rigorous results in 
mathematics. 

Dennis Hejhal, an expert on the trace formula at the 
University of Minnesota and the University of Uppsala in 
Sweden, has conducted extensive numerical experiments, 
computing eigenvalues and closed curves for various curved 
spaces. In one of his early calculations, Hejhal thought he 
might be on to a proof of the Riemann Hypothesis: Among 
the eigenvalues for one of Selberg’s trace formulae, he saw 
numbers that he recognized as zeroes of the Riemann zeta 
function. Closer inspection revealed that they really were 
zeta zeroes. Moreover, the calculation was producing all of 
the zeroes of the zeta function. Alas, it was all a mirage. The 
zeta zeroes had crept into the computation by accident: The 
curved space had a certain “singular” point which required a 
minor adjustment in the differential operator; once the 
adjustment was made, the zeta zeroes disappeared from the 
spectrum. 

Selberg, who recently turned 80, says he has never seri- 
ously tried to prove the Riemann Hypothesis. He’s worked 
on the problem of the zeroes, he says, “but always with a 
somewhat more limited goal in mind.” In fact, he adds, 
“there have probably been very few attempts at proving the 
Riemann Hypothesis, because, simply, no one has ever had 
any really good idea for how to go about it!” 


Atle Selberg. (Photo courtesy of the 
Institute for Advanced Study.) 
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Figure 2. Trajectories on a standard, rectangular billiard table (a) are 
easy to analyze. The case of “stadium billiards” (b) — (d) is provably 
chaotic. (Figures (c) and (d) courtesy of Steve Tomsovic.) 
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behaves much like an ideal billiard ball on a perfect pool table: The 
eigenstates of the electron correspond to periodic trajectories of the 
billiard ball, and for a rectangular table, these are easy to compute 
(see Figure 2a). Put the same electron inside a “stadium”—a rec- 
tangular box capped at each end by semicircles (see Figure 2b)— 
and all hell breaks loose. That’s because the classical dynamics of 
a billiard ball rattling around inside such a shape is inherently 
chaotic. 

Roughly speaking, quantum chaos is concerned with the spec- 
trum of a quantum system when the classical version of the system 
is chaotic. Physicists such as Berry and John Keating, also at the 
University of Bristol, Achim Richter at the Technical University of 
Darmstadt, and Oriol Bohigas at the University of Paris believe— 


WHAT'S HAPPENING IN THE 
MATHEMATICAL SCIENCES 


and have begun to marshal evidence—that there is a subtle interplay 
between the mathematical description of chaotic classical systems 
and the spacing of energy levels in their quantum-mechanical coun- 
terparts. 

In the 1960's, Martin Gutzwiller, a physicist at the IBM 
TJ. Watson Research Center in Yorktown Heights, New York, pro- 
posed that classical chaos and quantum-mechanical eigenvalues are 
related by an equation called a trace formula. On one side of the 
equation is a combination of numbers representing the lengths of 
closed orbits in the classical system. On the other side is a combi- 
nation of the eigenvalues of the quantum-mechanical version of the 
system, Gutzwiller’s theory is that the two sides of the equation, 
which use such different data, really are equal. 

This equality has profound consequences. By studying closed 
orbits in a purely classical (albeit chaotic) setting, physicists can 
deduce properties of a quantum-mechanical system without having 
to solve the system’s Schrédinger equation—which is, in general, 
beyond any computer's ability anyway. In particular, the trace for- 
mula has implications for the spacing between energy levels: For 
many systems, it “discourages” eigenvalues from being too close 
together. When physicists draw histograms of the distribution of 
energy spacings, they find a curve quite different from the familiar, 
exponentially decreasing “Poisson” distribution (see Figure 3). 

Gutzwiller’s theory of the trace formula didn’t come entirely out 
of the blue. Physicists have used various types of trace formulas 
over the years; Gutzwiller’s formula itself was derived from an 
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approach to solving the Schrédinger equation known as Feynman’s 
path integral (introduced by the physicist Richard Feynman). What 
Gutzwiller didn’t know—at the time—was that mathematicians 
had also developed an extensive theory of trace formulas for a com- 
pletely different purpose: number theory. 

Mathematicians have long been fascinated by the existence of 
prime numbers—numbers such as 2, 3, 5, etc., which have no fac- 
tors except themselves and 1. One of the oldest facts about primes 
dates back to Euclid, around 300 BC: There are infinitely many of 
them. Euclid’s theorem is easily proved (see “Euclid’s Finest: A 
Prime Time Proof”), but it leaves another question unanswered: If 
you look only at the numbers up to, say NV, how many primes do 
you find? 

Number theory reached the first of many peaks in 1896, with a 
proof of the celebrated Prime Number Theorem, which states that 
the number of primes up to any number N is approximately 
N/1\nN, where In N is the natural logarithm of NV. The Prime 


Poisson NDE 
1726 spacings 


Figure 3. The nearest-neighbor spacings for a nuclear data ensemble 
(NDE) are similar to the statistics of a set of random matrices called the 
Gaussian orthogonal ensemble (GOE). The Poisson distribution is shown 
for comparison. (Figure courtesy of Oriol Bohigas, Institut de Physique 
Nucleaire, Orsay, France, and Kluwer Academic Publishers, O. Bohigas, 
R. U. Haq, and A. Pandey in “Nuclear Data from Science and Technology, 
K. H. Bockhoff, ed., Reidel, Dordrecht 1983, Figure 1, p. 809. With kind per- 
mission from Kluwer Academic Publishers.) 
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Number Theorem was proved independently by the French mathe- 
matician Jacques Hadamard and the Belgian mathematician 
Charles-Jean de la Vallée Poussin. The formula N/ In N is good to 
within about 5%, for example, in estimating the number of primes 
up to a billion. (The formula predicts about 48 million; the exact 
count is 50,847,534.) 

The proof of the Prime Number Theorem actually implies other, 
more complicated formulas which are even more accurate. The 
main one says that the number of primes up to N is approximately 
equal to what's known as the logarithmic integral of N, which is 
defined as the area beneath the curve y = 1/|Inx between x = 2 
and « = N (see Figure 4). A succinct way to state the Prime 
Number Theorem is 7(V) = Li(N), where 7 means “the number 
of primes” (up to V) and “Li” is short for “logarithmic integral.” 
(The wavy equal sign denotes an approximation. In this case, it 
means that the ratio 7(.V)/ Li(N) gets closer and closer to 1 as N 
gets larger and larger.) If JN is a billion, this approximation is off 
by only three and a third thousandths of a percent. 

Computational evidence suggests that Li(N) is an extremely 
good approximation to 7(V). Mathematicians have theoretical 
reasons to believe that the relative error incurred by the logarithmic 
integral is bounded by some constant multiple of (In V)?/ VN. 
But despite a century of intense study, they don't have a proof that 
the error actually stays that 
small. 

What stands in the way is 
a “minor technicality” 
known as the Riemann 
Hypothesis. In mathemat- 
ics, such seemingly 
insignificant details are 
often the acorns from which 
mighty theories grow. 
That’s certainly been the 
case for the Riemann 
Hypothesis. 

The Riemann Hypothesis 
concerns a mathematical 
object called the Riemann 
zeta function. The zeta func- 
tion is defined by summing 


Figure 4. The number of primes up to N can be estimated by measuring the 
area beneath the curve y = 1/Inz. 
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inverse powers of the positive integers: 
C(s)=1°4+254+3%+4%4--- 


For example, ¢(2) = 1+ i +5+ 45 +--+, This sum, a bit sur- 
prisingly, turns out to equal 77/6. The zeta function itself was 
introduced in the eighteenth century by the Swiss mathematician 
Leonhard Euler (who found the value 7/6 for ¢(2)), but the name 
Riemann has stuck because the nineteenth-century German mathe- 
matician Bernhard Riemann first showed that the properties of the 
zeta function are intimately intertwined with the distribution of 
prime numbers. 

Riemann, who was primarily a mathematical physicist and 
geometer (he created much of the mathematics that later went into 
Einstein’s theory of general relativity) wrote only one paper on 
number theory, an eight-page memoir published in 1859, titled 
Uber die Anzahl der Primzahlen unter einer gegebener Grosse 
(“On the Number of Prime Numbers Less Than a Given 
Magnitude”). In his paper, Riemann took Euler’s infinite series and 
applied to it the then-new tools of complex analysis. He showed 
that the Prime Number Theorem could be proved by deriving cer- 
tain complex properties of the zeta function. In short, Riemann set 
the stage for Hadamard and de la Vallée Poussin—but also for a 
century of mathematics to follow, 

Riemann’s key contribution was an equation linking two ways 
of looking at the zeta function. Euler had discovered one side of 
Riemann’s equation. He had shown that the zeta function—initial- 
ly defined as a sum over the positive integers—can be rewritten as 
a product over the prime numbers: 


1 
(1 — 2-8)(1 — 3-*)(1 —578)(1 — 7-8) -- 


(s) = 


What Riemann observed (and Hadamard later proved) was that 
the zeta function can also be written as a product over its zeroes in 
the complex plane: 


¢(s) = f(s)(1 — 8/1)(1 — 8/p2)(1 — s/ps)--- , 


where pi, /2, etc. are the complex numbers for which ¢(p) = 0, 
and f(s) is a fairly simple “fudge factor.” Riemann showed that by 


Riemann’s key 
contribution was an 
equation linking two 
ways of looking at the 
zeta function. 
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equating the logarithms of these two expressions for the zeta func- 
40: tion, it’s possible to derive not just the approximation Li(N) for the 

number of primes up to NV, but a whole sequence of increasingly 

accurate approximations—in effect, an exacr formula for 7(V). 


30: Riemann’s equation implies that the distribution of prime num- 
bers can be “heard” in the zeroes of the zeta function, much as a 
201 musician can “hear” the sound of a symphony by looking at the 


notes on a printed score. The zeta function, it turns out, has one 
batch of zeroes at the negative even integers, —2, —4, —6, etc. 
Number theorists refer to these as the “trivial” zeroes of the zeta 
function. 

What's interesting are the “nontrivial” zeroes. 

Riemann showed that all the nontrivial zeroes lie in a particular 
part of the complex plane: the infinite strip lying above and below 
+ the unit interval from s = 0 to s = 1 (see Figure 5). Furthermore, 
| the zeroes are symmetrically arranged, so that each zero above the 
ul 
it 


107 


unit interval has a mirror image below, and each zero (if any!) in the 
right half of the strip has a mirror image in the left half. Riemann’s 
paper includes a formula that estimates the number of zeroes that 
lie above the real axis and below any given height. 

It turns out that the Prime Number Theorem is equivalent to the 
assertion that no (nontrivial) zero of the zeta function lies on the 


—40i 
boundary of the “critical strip.” Furthermore, the relative error of 
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Figure 5. Zeroes of the Riemann zeta function (dark dots, top) lie in a strip. The Riemann Hypothesis says they all lie 
on the vertical line through s=1/2. The graph 1o4 + it)| is plotted (bottom) for t between 1 and 60. 
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the approximation 7(V) ~ Li(N), depends only on how close to 
the boundary the nontrivial zeroes ever get. And here is where 
things get dicey. 

As far as anyone knows, there may be zeroes of the zeta func- 
tion extremely close to the boundary. If that’s the case, then the rel- 
ative error term in the Prime Number Theorem is rather large. (It 
doesn’t mean the theorem is wrong, of course; it just means that the 
logarithmic integral only guarantees a good approximation to 
a(N) when N is extremely large.) If, on the other hand, all the 
zeroes are as far from the boundary as they can be—that is, if they 
all lie on the “critical line” above and below s = 1/2—then the rel- 
ative error term is as small as it can possibly be, which turns out to 
be the constant multiple of (In N)?/VN which we spoke of earli- 
er. 

In his paper, Riemann called it “very likely” that all the zeroes 
of the zeta function lie on the critical line. This is the celebrated 
Riemann Hypothesis. 

If true, the Riemann Hypothesis has profound implications not 
just for the error term in the Prime Number Theorem, but in many 
other parts of mathematics. The mathematical literature is rife with 
theorems that say “If the Riemann Hypothesis is true, then...” 
Many other published papers go to great lengths to prove theorems 
that would be simple corollaries of the Riemann Hypothesis. 

Harold Stark, a number theorist at the University of California at 
San Diego, jokes that a proof of the Riemann Hypothesis “would 
make lots of my life’s work irrelevant.” Nevertheless, he says, “the 
idea that you can presumably correctly conjecture that infinitely 
many numbers are on a particular line, and you can’t prove it, is 
frustrating beyond any description. It’s just unacceptable!” 

But what's any of this got to do with quantum chaos? The 
answer lies in the two ways of looking at the zeta function: 
Riemann’s equation is a trace formula. Physicists think it’s the 
mother of all trace formulas for quantum chaos. They believe the 
zeroes of the zeta function can be interpreted as energy levels in the 
quantum version of some classically chaotic system. If they’re 
right, the Riemann Hypothesis must be true. 

There’s a lot more than wishful thinking and soft analogies 
behind this line of reasoning. Mathematicians have collected a vast 
amount of data on the zeroes of the zeta function. So far, the data 
show every indication of perfect agreement with the statistics of 
energy levels in quantum chaos. That's part of the appeal for 
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In his paper, Riemann 
called it “very likely” 
that all the zeroes of 


the zeta function lie on 


the critical line. 


WHAT’S HAPPENING IN THE 
MATHEMATICAL SCIENCES 


31 


physicists: The eigenvalues of the Schrédinger equa- 
tion for actual quantum systems are hard to compute, 
and extremely hard to measure experimentally, so many 
of the predictions of quantum chaos are difficult to test. 
But zeroes of the zeta function are relatively easy to 
come by; number theorists can churn out tens of mil- 
lions of them on demand. 

The chief supplier of zeta zeroes is Andrew Odlyzko 
at AT&T Labs in Florham Park, New Jersey. Odlyzko 
has developed extremely efficient algorithms that make 
it possible to compute hundreds of millions of zeroes at 
‘Andrew’ Mi| Odlyaboull( PROIIeomnran op Moe various “heights” in the critical ao, Bicedlesss to say, 
Odlyzko.) they've all, so far, been found to satisfy the Riemann 
Hypothesis. (A single counterexample would be 
enough to ruin the Riemann Hypothesis. It would also 
end much of the interest physicists have in the zeta func- 
tion.) 
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Actually, Odlyzko’s computer has reported a number of coun- 
terexamples, but every one (so far) has been traced to a hard- or = single counter- 
software glitch. “The Riemann Hypothesis is very sensitive to any example would be 
kind of error,’ Odlyzko explains. “The slightest mistake, especial- enough to ruin the 
ly with the methods I use, produces counterexamples.” Riemann Hypothesis. 

In purely mathematical terms, Odlyzko’s computations indicate 
that the spacings between consecutive zeroes of the zeta function 
behave, statistically, like the spacings between consecutive eigen- 
values of large, random matrices belonging to a class known as the 
Gaussian Unitary Ensemble (see Figure 6). It was in precisely this 
context that the zeta function first caught the eye of physicists, in 
the early 1970's. 

Random matrices had been proposed as a way of studying the 
energy levels of large nuclei. The idea originated with Eugene 
Wigner, a theoretical physicist at Princeton University. (Wigner is 
best known to mathematicians for his 1960 essay, “The 
Unreasonable Effectiveness of Mathematics in the Natural 
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Figure 6. A comparison of the 
statistics of spacings between 
conecutive eigenvalues of ran- 
dom matrices (solid curve) 
and nearest-neighbor spac- 
ings of zeroes of the Riemann 
zeta function (data points). 
The fit is not bad for the first 
million zeroes (left), but near- 
ly perfect for the spacings 
among 1,041,600 zeroes near 


1.0 the 2 x 107th zero. (Figure 
courtesy of AT&T.) 
normalized spacing 
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Sciences.” The zeta—chaos connection would seem to be a case in 
point.) More recently Oriol Bohigas and colleagues at the Institut 
de Physique Nucléaire in Orsay, France, have brought the random- 
matrix interpretation under the umbrella of quantum chaos. But 
during the 1960's, the mathematics of random matrices was inten- 
sively studied in its own right by theoretical physicists, including 
Freeman Dyson at the Institute for Advanced Study in Princeton, 
New Jersey. 

In 1972, Hugh Montgomery, a number theorist at the University 
of Michigan, was visiting the Institute for Advanced Study. 
Montgomery had been studying the distribution of zeroes of the 
zeta function, in hopes of gaining insight into the Riemann 
Hypothesis. He was able to prove that the Riemann Hypothesis had 
implications for the spacing of zeroes along the critical line, but his 
key discovery was an additional property that the zeroes seemed to 
have, one which implied a particularly nice formula 
for the average spacing between zeroes. 

During tea one day at the Institute, Montgomery 


Odlyzko’s computations agree was introduced to Dyson and described his conjec- 


amazingly well with 
Montgomery’s conjecture. 
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ture. Dyson immediately recognized it as the same 
result as had been obtained for random matrices. 

“Tt just so happened that he was one of the two or 
three physicists in the world who had worked all of 
these things out, so I was actually talking to the great- 
est expert in exactly this!” Montgomery recalls. 

Odlyzko’s computations agree amazingly well with 
Montgomery’s conjecture, notes Peter Sarnak, a number theorist at 
Princeton University. But they do much more than that, Sarnak 
insists: They are “the first phenomenological insight that the zeroes 
are absolutely, undoubtedly ‘spectral’ in nature.” Riemann himself 
would be impressed, Sarnak says. 

Sarnak and Nicholas Katz, also at Princeton, have found addi- 
tional evidence for a spectral interpretation of the zeta zeroes, The 
Riemann zeta function is just one of many number-theoretic zeta 
functions, each of which has an analogue of the Riemann 
Hypothesis. Sarnak and Katz have analyzed in detail the relation 
between random matrices and a class of zeta functions for which 
the Riemann Hypothesis has actually been proved. (Unfortunately, 
the proofs for these zeta functions seem to offer no clue as to a 


proof of “the” Riemann Hypothesis.) They have shown that the 
key statistics for the spacing of eigenvalues and zeroes of their zeta 
functions are exactly the same. 

Of possibly greater importance, the key statistics for eigenvalues 
are relatively robust: Sarnak and Katz have shown that the statis- 
tics are the same for several different classes of random matrices. 
That’s good news for physicists. It suggests that the assumptions 
they make in choosing one class of random matrix over another 
don’t affect the properties they're interested in. 

Despite the stunning advances linking Riemann’s zeta function 
to twentieth-century physics, no one is predicting an imminent 
proof of the Riemann Hypothesis. Odlyzko’s numerical experi- 
ments and the evidence amassed by physicists have convinced 
everyone that a spectral interpretation of the zeta zeroes is the way 
to go, but number theorists say they are at least one “big idea” away 
from even the beginnings of a proof. Mathematicians 
aren’t yet sure what to aim at, says Sarnak: “What we 


really have to do with the Riemann Hypothesis is put Despite the stunning advances 

it in the ballpark.” linking Riemann’s zeta function 
Of course that was how things stood not long ago _¢g twentieth-century physics, no 

for Fermat's Last Theorem (see What’s Happening in gone is predicting an imminent 


the Mathematical Sciences, volumes 2 and 3). In the proof of the Riemann 
early 1980°s no one had any idea how to prove Hypothesis. 


Fermat's famous marginal remark. Then, in the mid 
80's, number theorists discovered a deep connection 
between Fermat’s Last Theorem and a branch of mod- 
ern mathematics known as the theory of elliptic curves. Within ten 
years, this discovery had culminated in Andrew Wiles’s solution of 
the 350-year-old problem. 

Interestingly, the denouement of Fermat’s Last Theorem was 
preceded by a conference, held in 1980, on recent developments in 
the mathematics of Fermat’s Last Theorem. Number theorists are 
hoping that history repeats itself: In 1996, the recently founded 
American Institute of Mathematics (AIM) sponsored a conference 
devoted to the zeta function and its connections with quantum 
chaos. If everything goes according to “plan,” the first break- 
through idea will come in 2001, and a proof of the Riemann 
Hypothesis will be published in 2011. Stay tuned. 
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Paul Erdos. (Photo courtesy of George Csicsery, from his documentary film, N is 
a Number: A Portrait of Paul Erdés.) 
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Proof by Example: A 


Mathematician’s Mathematician 


mathematician,” the mathematician Paul Erdés often said, 
“is a device for turning coffee into theorems.” 

Erdds knew whereof he spoke. Over a career spanning seven 
decades, Erd6s, who died in 1996 at age 83, turned an ocean of cof- 
fee into a sea of theorems, publishing more than 1500 papers. He 
ranged over areas as diverse as approximation theory and combi- 
natorics. His research in graph theory, especially, anticipated—and 
then helped further—the use of mathematics in computer science. 
Although Erdos never even learned to use e-mail, he may be as 
responsible as any other person for the explosive growth of com- 
puter networks and their already profound effects on daily life. 

Erdés himself was the hub of a worldwide network. He collab- 
orated with hundreds of mathematicians in dozens of countries 
over the years. Many thousands more flocked to his lectures and 
worked on problems he posed. 

Everyone who knew Erdés misses him now. Janice Malouf, 
who met Erdos when she was a graduate student at the University 
of Illinois, puts it as well as anyone: “I think he was my best friend 
in a lot of ways—just someone to go see and play with at meet- 
ings. He was just a real gentle man, a wonderful child.” 

“Its hard to overemphasize what a warm and wonderful and 
friendly person he was,” adds Melvyn Nathanson of the City 
University of New York. The key to his personality was simple, 
Nathanson says: “He liked to talk mathematics with people.” 

Joan Hutchinson agrees. Hutchinson, now at Macalester 
College in St. Paul, Minnesota, recalls meeting Erd6s soon after 
she had received a Ph.D. “I was just bowled over that this famous 
Hungarian would be willing to walk and talk with me, just right 
from the start,” she says. “It was so stimulating, to a young person 
in the field, to immediately be included by this person who was so 
completely natural and open and excited about mathematics.” 

“Paul was very generous,” explains Ron Graham, a longtime 
Erd6s collaborator at AT&T Labs in Florham Park, New Jersey, 
who calls Erdés the “Johnny Appleseed” of mathematics. Erdos 
gave not only his time and energy, Graham notes, he also put his 


Over a career spanning 
seven decades, Erdos 
turned an ocean of 
coffee into a sea of 
theorems 
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money where his mouth is, offering cash prizes for solutions of his 
favorite problems. 

But maybe most valuable was the example Erdés set. 

“His life was an inspiration,” says Joel Spencer, a mathematician 
at the Courant Institute for Mathematical Sciences. “Here was 
somebody who was truly brilliant and who had enormous energy, 
and yet he was devoting his life to this quest in mathematics.... The 
rest of us would say, well if he’s putting this enormous effort into 
it, we can at least try to put in half as much.” Working with Erdés, 
Spencer says, was “the most amazing thing that’s ever happened, or 
ever will happen, in my life.” 

Pal Erd6s was born on March 26, 1913, in Budapest, Hungary. 
His parents were both teachers. The young Erdos quickly attracted 
attention for his mathematical ability. He published his first paper, 
on a result in number theory, at age 19, while he was a student at 
the University of Hungary. Two years later he received his Ph.D. 
and took a position at the University of Manchester in England. In 
1938, he came to the United States. Denied a re-entry visa during 
the McCarthy era, he spent five years in western Europe, travelling 
on a special Hungarian passport. He was finally allowed back into 
the United States, but remained a perpetual world traveler, the 
Flying Dutchman (or Hungarian) of mathematics. 

Erdos specialized in using elementary methods to solve difficult 
problems. The term “elementary” can be misleading. 
“Elementary, as used in mathematics, is not the same as simple;” 
notes Spencer. In general, an elementary proof is one that does not 
rely on advanced results from other parts of mathematics. “An ele- 
mentary proof uses basic ideas that pretty much all mathematicians 
understand,” Spencer explains. “The distinction between an ele- 
mentary proof and a non-elementary proof is not in its simplicity, 
but in what kind of techniques it uses. The proof itself may be quite 
involved, covering a lot of pages and including a lot of clever 
ideas.” 

Looking for an elementary proof is like taking a walk rather than 
driving a car: The car may be faster, but hiking has its own 
appeal—and there are lots of interesting places that you can reach 
only on foot. “That's a good analogy,” Spencer says, “but I would 
hasten to add that Erdés got awfully far by foot!” 

The Erdés “style” was to pose problems. “Erd6s was very much 
problem oriented,” says Béla Bollobas, a mathematician at the 
University of Memphis and a Fellow of Trinity College at 


Cambridge University. Problems always came first. Instead of 
concocting problems that are easily solved by a particular theo- 
rem—a standard way of establishing the utility of a new result— 
Erdos would start with a problem, and challenge himself and oth- 
ers to find the theorem that would solve it. 

“The problems were designed to make you reach a little bit fur- 
ther than you thought you could before,” Graham explains. “Paul 
was very good at capturing the next obstacle in a very explicit way.” 

Over the years, Erdés paid thousands of dollars for solutions to 


problems he had posed. The amounts varied from a few dollars to 
$1,000. “It wasn’t like winning the lottery,” says Hutchinson. 
Erdos himself liked to point out that, considering the amount of 
work required, his payments were well below minimum wage. “It 


Paul Erdés and Béla Bollobas. (Photo courtesy of George Csicsery, from his documentary film, N is a Number: A 
Portrait of Paul Erdés.) 
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meant more to solve the problem and be recognized by him,” 
Hutchinson explains. “Sometimes he’d give people a $10 bill. 
They'd get him to autograph the bill, and then they would keep it 
forever.” 

“The excitement when you were able to solve an Erdos problem, 
or solve some interesting mathematical problem inside the Erdos 
circle, was just incredible,” recalls Spencer. 

Erdés was also a prolific co-author—‘the master of collabora- 
tion,” says Jerrold Grossman at Oakland University in Rochester, 
Michigan. More than two-thirds of his papers were joint efforts 
with one or more mathematicians from a pool of over 500 co- 
authors. Erd6s’s penchant for collaboration inspired the definition 
of “Erdés number”: His 500-some co-authors have Erdés number 
1, their co-authors have Erd6s number 2, and so on. At last count, 
over 5000 people worldwide have Erdos number 2, and the count 
continues to grow, as the ErdOs-number-! crowd attracts new co- 
authors. 

“He had this childlike curiosity which really was infectious,” 
says Malouf. “He was constantly talking about mathematics and 
playing mathematics with people.” 

In mathematics generally, notes Grossman, the percentage of 
jointly written papers has risen from under 10% in the 1940's to 
nearly 50% today (see Figure 1). For Erd6s’s collaborators, the 
percentage is even higher (see Figure 2). “People who have writ- 
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Figure 1. The percentage of mathematics papers with a single author has decreased steadily since the incep- 
tion of Mathematical Reviews on 1940. (Figure courtesy of Jerrold Grossman and Utilitas Mathematica 


Publishing Inc.) 
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ten with Erdés have a higher tendency to write joint papers,” 
Graham observes. “Now which was the chicken and which was the 
egg?” 

E-mail has certainly contributed to the growing popularity of 
collaboration. But Erd6s preferred a more direct approach: con- 
stant travel. With no permanent home, Erdés visited hundreds of 
colleagues in dozens of mathematics departments around the world 
every year. Nor did he restrict himself to the top research univer- 
sities; his peregrinations were as likely to lead him to Baton Rouge 
or Boca Raton as to Boston or Berkeley—wherever there were 
people who shared his mathematical interests. 

“Whenever somebody would invite him to a place, whether it 
was a small place or a big place, he would immediately say yes.” 
says Hutchinson, who estimates she saw Erdos two or three times 
a year at conferences. “He loved to come to different schools and 
‘preach’ at different places.” And he always had the latest news on 
the newest results, she says. “He operated almost as fast as e- 
mail!” 

“Every time he visited, there would be this enormous spurt of 
activity,” Spencer recalls. “It was just staggering that he could 
work with so many people. It was really a very special style that 
he had. Discrete mathematics has this very cooperative air to it, 
and I think a lot of that owes to his influence.” 

Erdés kept a notebook of problems that he had discussed with 


Figure 2. Mathematicians with Erdés number 1 often collaborate with each other as well. The height of the bar 
above each value k is the number of Erdés's coauthors who have collaborated with k other Erd6s coauthors. 
(From “The Erdés Number Project” website, www.acs.oakland.edu/~grossman/erdoshp.himl.) 
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Figure 3. With five points (or fewer), 
it’s possible to color the lines connect- 
ing them so that no three points are 
connected by lines of the same color. 
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the mathematicians he visited, and he would review it before arriv- 
ing for another visit, Graham explains. “Before showing up at a 
particular place that he had visited, say last year, he would read 
over the things he had been thinking about and wrote down then 
and the people he talked to, so that when he got there he could pick 
it right up,” Graham explains. 

He also knew which problems would interest which of his col- 
laborators. As Richard Guy, a mathematician at the University of 
Calgary, puts it, “Erdés not only asked the right questions, he asked 
them of the right person. He knew better than you yourself what 
you were capable of.” 

His own interests spanned a wide range. Erdés left his mark on 
set theory, number theory, graph theory, combinatorial geometry, 
and probability theory, to name just a few. 

For many, including Spencer and Bollobés, Erd6s’s greatest 
legacy is his invention of what's called the probabilistic method in 
discrete mathematics. The idea is to prove the existence of a com- 
binatorial or number-theoretic structure with a certain property by 
showing that if the structure is chosen at random, there is a positive 
probability that it has the desired property. Often, this probability 
turns out to be quite large. But any positive probability, no matter 
how small, implies there's at least one such structure. 

Others had occasionally used probabilistic arguments in combi- 
natorial proofs, but it was Erdés who championed the method. He 
used it first, in 1947, in a subject known as Ramsey theory. A key 
result in Ramsey theory says that if you have a large enough num- 
ber of points, N, and color each line segment joining a pair of 
points either, say purple or black, then no matter how you do the 
coloring, it’s always possible to find a subset of some pre-deter- 
mined number of points, k, for which all line segments joining 
pairs of points in the subset have the same color. The question is, 
given k, how large must N be? For example, if k = 3, N must be 
at least 6 (see Figure 3). 

Erd6s showed that if (/!) < 22)" then N has to be larger than 
n. This doesn’t say much for small values of k (if k = 3, for 
instance, it says N is larger than 3), but it provides a meaningful 
lower bound when k is relatively large. The proof is remarkably 
simple. There are 2(2) ways to color the line segments in any sub- 
set of k points, but only two of those ways are single-colored, so if 
the line segments are colored at random, the probability of any 
given subset being single-colored is 2/23), But the number of k- 
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member subsets is ( es so the probability that a random coloring 
includes a single-colored subset of size k is at most ('!) - 2/2(2). If 
(2) < 2G), this probability is less than 1, so some fraction of the 
random colorings do not include a single-colored subset of size k. 

For example, if k = 10 and n = 100, ('\)’) +5 x 10! while 
a2) /2 = 2" = 18 x 10!*. Thus it’s possible to color the line seg- 
ments connecting 100 points without creating any single-colored 
subset of the lines on 10 points. In fact, it’s been shown (by other 
methods) that you can avoid such a subset even when you have as 
many as 797 points! 

Finding the exact number N at which it becomes impossible to 
avoid a single-colored subset of lines on k points is much more dif- 
ficult. The answer is known in just two cases: For k = 3 it’s 6, and 
for k =4 it’s 18. When k = 5, the “Ramsey number” is conjec- 
tured to be 43, but it’s only known to be somewhere between 43 and 
49 (see Figure 4). For k = 10, the range is even larger: at least 
798, and at most 23,854. 

One could, of course, settle the conjecture for k = 5 by exhaus- 
tively checking all the different colorings of the line segments con- 
necting 43 points. But there are 2 2) ~ 1027 such colorings, so a 
brute-force search is out of the question. And things are obviously 
even worse for a case like k = 10. 

Erd6s’s 1947 theorem implies that the Ramsey number for each 
k exceeds 2'/*. The importance of the theorem, Spencer explains, 
is less in what it says about particular values of k, such as k = 100 
(let alone k = 3), but rather in the “asymptotic” nature of the 
result: It gives a bound that holds for arbitrarily large values of k. 

An interesting feature of Erdés’s proof, notes Spencer, is that 
although it establishes the existence of a coloring of the line seg- 
ments connecting 2'/? points without a single-colored subset of k 
points, it offers no recipe for finding such a coloring, other than to 
try random colorings until one works. Surprisingly, researchers 
have yet to find a recipe that leads straight to a solution. 

The probabilistic method and other Erdés innovations have pro- 
foundly changed not only mathematics, but also computer science. 
Indeed, “he probably has a greater influence in computer science 
than he does in mathematics,” notes Nathanson. “His ideas, meth- 
ods, and techniques pervade a large part of theoretical computer 
science.” 

“A lot of his combinatorial questions have moved over into 
analysis of algorithms and computer science.” Spencer explains. 
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at least at most 


Figure 4. Ramsey numbers are known 
for k =3 and 4. For larger k, only 


upper and lower bounds are known. 
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motivated by his 
internal aesthetic 
sense.” 
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“There are some natural reasons: Algorithms themselves tend to 
be discrete processes. Also, the basic paradigm for studying algo- 
rithms in computer science is asymptotic analysis. If you talk 
about sorting, you describe it as an ‘N log N algorithm’. So in the 
theoretical sense, you're not interested in how much time it takes 
to sort 100 elements or 200 elements; you're interested in N ele- 
ments, and you're interested in the asymptotics of that function as 
N goes to infinity. This kind of asymptotic analysis is central to 
theoretical computer science. It was also central to Paul’s combi- 
natorial investigations.” 

Erd6s’s influence in computer science is, in one respect, para- 
doxical: Erd6s himself had little interest in computers. “Paul was 
always just motivated by his internal aesthetic sense,” says 
Spencer. “He did mathematics because he felt that these were 
interesting problems.” 

“Of course,” he adds, “Paul had an incredible sense about what 
were interesting problems.” 


Simple Arithmetic 


Here’s your chance to make three grand the hard way. 
One of Erdés’s largest cash offers was for the following 
conjecture. 

Suppose a; < a2 < ay < ... is an increasing sequence 
a pose integers, with the property that 

o+2 +2 +--»=o00. Must every such sequence 
include arithmetic progressions of all lengths? 

An arithmetic progression is a list of numbers that 
ratchet up by a constant amount: 4, 7, 10, 13 is an exam- 
ple. The problem does not require the terms in progression 
to appear consecutively in the sequence (otherwise the 
sequence I, 2, 4, 5, 7, 8. 10, 11,... would be a counterex- 
ample). 

The conjecture is unresolved even in certain special 
cases, most notably for the sequence of prime numbers. 
The Swiss mathematician Leonhard Euler proved in 1737 
that $+ 444 4+--=00, but nobody knows 
whether there are arbitrarily long arithmetic progressions 
of primes. 

It would be a breakthrough, notes Ron Graham, even to 
prove that every sequence with atatet =0o 
contains three-term arithmetic progressions. So sharpen 
your pencils. But don’t give up your day job. 


Figure 5. One of Erdés’s “mathematical playgrounds.” Random lattice graphs can be created by drawing edges 
between pairs of nearby points with probability p. Shown are examples for p = 0.4, 0.5, and 0.6. The largest con- 
nected component grows dramatically at p = 0.5. The second largest component for p = 0.5 is also quite large. 
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Geometric Reasoning. The tangent lines at three points of an ellipse form a triangle in the plane. When the vertices 
of the triangle are connected to the points of tangency, the three lines intersect at a single point. Can you prove it? 
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Computers Take Algebraic 
Geometry Back to Its Roots 


M people’s recollection of high-school algebra centers on 
the quadratic formula: that mantra of a’s, b’s, and c’s for 
finding the “roots” of a polynomial of degree 2. Like the 
Pythagorean equation a? + b? = c? or Einstein’s famous E = me”, 
the expression 


— —b+ Vb? — dac 
= 2a 


is instantly, if distantly, recognizable. Even for those who don’t 
remember what it’s for, the quadratic formula evokes Proustian 
memories of chalkboards and chewing gum and notes passed back 
and forth. 

Computer memories work a bit differently: no lime-flower tea or 
madeleines for them. That’s too bad, because mathematicians are 
calling on machines to do more and more of what's called compu- 
tational algebraic geometry. A suite of sleek new algorithms based 
on sophisticated theorems is enlivening a branch of mathematics 
that many had counted as moribund, and opening up applications 
previously considered unapproachable. 

The computer revolution has reawakened interest in developing 
general methods that systematically handle all instances of alge- 
braic equations, no matter how complicated. Machines make it 
possible to take on equations that are “well beyond what you can 
do by hand,” says David Cox at Amherst College in Amherst, 
Massachusetts. What the machines need are algorithms. 
Developing the requisite algorithms and making them practical 
calls for advances in the theory underlying them. 

Algebraic geometry already had a “magnificent theory,” Cox 
notes. With the new algorithms, he says, “you have a powerful the- 
ory, and you can compute powerful examples.” 

Computational algebraic geometry has also spurred the develop- 
ment of additional theory, creating new connections between alge- 
braic geometry and branches of mathematics as diverse as 
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of mathematics that 
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topology and combinatorics. Applications of the algorithms have 
been found in areas ranging from integer programming to partial 
differential equations. 

“It’s an indication that mathematics is more unified than people 
realized,” says John Little at the College of the Holy Cross in 
Worcester, Massachusetts. Some of the discoveries “are just so 
unexpected it sort of takes your breath away,” adds Cox. 

Algebraic geometry is pretty breathtaking to begin with. The 
subject has a long and distinguished history. It got started by the 
seventeenth-century French mathematician René Descartes, who 
showed how to translate systematically back and forth between 
geometric objects and algebraic equations. Lines, for example, 
become equations of the form ax + by = c. Circles turn into equa- 
tions like x? + y* = r?. And upright parabolas find their algebra- 
ic Doppelgiinger in the equation y = ax* + br + c. That's where 
the quadratic formula comes in: It locates the two points (if there 
are two) where the parabola pierces the c-axis (see Figure 1). 


y=r?+a-1 


Figure 1. A parabola crosses the x-axis no more than twice. The 
quadratic formula says exactly where the roots are. 
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Descartes is immortalized by the term “Cartesian coordinates,” 
which refers to the numbers that specify a point (xr, y) in the plane. 
(By happy coincidence, carte is French for map or chart.) Part of 
the beauty of Descartes’s creation is that it generalizes easily to an 
algebraic description of geometric objects in three-dimensional 
space, and beyond. Even when the shapes are impossible to pic- 
ture—say the intersection of two 9-dimensional “hypersurfaces” in 
10-dimensional space—the equations can still be written down, 
examined, and manipulated using the rules of algebra. 

Cartesian coordinates support all sorts of equations, including 
“transcendental” expressions like sin y = e*. But algebraic geom- 
etry is primarily concerned with polynomial equations, especially 
polynomials with rational coefficients. 

A great many problems, both theoretical and applied, boil down 
to studying systems of such equations. For example, researchers in 
robotics use polynomial equations to describe the constrained 
motion of a mechanical arm: Variables are associated with the 
angles at joints and with the lengths of various segments (see 
Figure 2). Similarly, chemical engineers frequently write down 


Figure 2. A robot arm reaches over a barrier to pick up a ball. The range of motion can be described by a system of 
polynomial equations and inequalities, with variables representing the x and y coordinates of the robot's joints and 


fingers. 
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systems of polynomial equations that describe the interrelated reac- 
tion rates of chemical compounds in a complex environment. 

When the equations are “linear’’—that is, when the variables x, 
y, etc. are never raised to powers or multiplied together—the solu- 
tions are relatively easy to find. The key technique is known as 
Gaussian elimination (named for the nineteenth-century German 
mathematician Carl Friedrich Gauss). The basic idea is to order the 
variables, say alphabetically, and then re-express the system of 
equations, by taking appropriate “linear combinations,” so that the 
first variable appears only in the first equation, the second variable 
appears only in the first and second equations, the third variable 
appears only in the first three equations, etc., down the line until 
you run out of equations. A process known as “back-substitution” 
then produces an explicit solution to the system of equations, start- 
ing with the last variable and working its way back to the first. 

Linear algebra is a powerful tool in the mathematician’s arsenal, 
because many problems are inherently linear, and for many that 
aren't, the nonlinear terms are small enough to be ignored. But 
some problems, including many of the most important problems in 
modern science and technology, are unavoidably nonlinear. How 
can you go about solving them? 

For years, mathematicians got by with ad hoc methods of solv- 
ing systems of higher-degree polynomial equations, treating each 
problem essentially anew. There seemed to be no single method, 
analogous to Gaussian elimination, that would apply in all cases. 
Instead, notes Little, algebraic geometers relied on shortcuts and 
clever tricks to find solutions. “When those clever tricks and short- 
cuts weren't available, they just couldn't do the computations,” 

Mathematicians didn’t so much despair at being unable to solve 
such problems as simply refuse to think about them. The theory of 
algebraic geometry grew in other, more abstract directions through 
most of the twentieth century. By the 1970°s, the subject had wan- 
dered far from its roots in Cartesian coordinates. Indeed, jokes 
Cox, “I learned algebraic geometry without ever seeing an exam- 
ple!” 

But the advent of unprecedented computing power has brought 
the seemingly mundane task of manipulating equations back into 
the limelight. Even inexpensive home computers can run sophisti- 
cated computer algebra systems, which do more than crunch num- 
bers. These systems crunch symbols too, manipulating equations 


SEE 


much as students are taught to do in algebra class. With two impor- 
tant differences: Computer algebra systems are a lot faster than stu- 


Computer algebra 
systems are a lot faster 
than students, and they 
make far fewer 
mistakes. 


dents, and they make far fewer mistakes. 

All that such a system needs is a theory of higher-degree 
Gaussian elimination. 

Enter Bruno Buchberger. 

In 1965, Buchberger, who is now chairman of the Research 
Institute for Symbolic Computation at the University of Linz, in 
Austria, wrote a Ph.D. thesis titled “On finding a vector space basis 
of the residue class ring modulo a zero-dimensional polynomial 
ideal.” In it, he established a systematic approach to solving sys- 
tems of polynomial equations. In 1970, he nailed down the first 


practical algorithm implementing the approach. The procedure is 


Bruno Buchberger. (Photo courtesy of Bruno Buchberger.) 
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now known as Buchberger’s algorithm. 

The key ingredient in Buchberger’s algorithm is the notion of 
what is now called a “Grébner basis.” (Buchberger introduced the 
name in honor of his thesis advisor, Wolfgang Grébner, in 1976, 
“when I understood the significance of the problem he had given to 
me!”) A Grébner basis is analogous to the simplified list of linear 
equations produced by Gaussian elimination: Each equation in the 
list is “simpler” than the one above it, until you get down to the bot- 
tom, where the last equation often involves just a single variable. In 
many cases one can back-substitute the roots of the last equation, 
and thereby work out the complete set of solutions of the original 
system of equations. (Finding the roots explicitly, as in the qua- 
dratic formula, is sometimes possible, but in general this last step is 
often best done using numerical methods.) 

“Grébner bases, in many ways, really are high-school algebra,” 
says Cox. 

The underlying theory, though, is at a somewhat higher level. 
For one thing, there’s a catch in what it means for one equation to 
be “simpler” than another. In the case of Gaussian elimination for 
linear equations, simplicity simply refers to the “disappearance” of 
variables: x vanishes after the first equation, y after the second, z 
after the third, etc. For nonlinear equations, not only do the vari- 
ables need to be ordered, but also all their powers and products. 

“Lexicographic” ordering is one way of doing this. Roughly 
speaking, lexicographic ordering lists terms as they would appear 
in a dictionary: Terms with powers of x come first, followed by 
terms with no x but a power of y, and so forth. There are many 
other ways to order things, however (see Figure 3). In fact, one 
area of current research is in trying to discern when to use what 
kind of ordering in order to calculate a Grébner basis most effi- 
ciently. 

That's a serious concern, because in calculating a Grébner basis, 
it often happens that the elimination of a particular term, say x?y, 
may leave “simpler” equations of horribly high degree with, say 
terms like z?2!*. By the time you've gotten rid of all the x's, you 
may be facing equations with grotesqueries like ry*!42?7'5__ In 
other words, Buchberger’s algorithm is guaranteed to terminate 
eventually, but it offers no promise as to how long things might 
take. 

That’s not a fault in the algorithm. Solving systems of algebra- 
ic equations is intrinsically complex, Buchberger notes. “We can 


prove that any algorithm that solves the problem in its generality— 
even the ones which we do not yet know!—must have a certain 
minimal complexity in the worst case.” This aspect of computa- 
tional algebraic geometry “lies in the nature of the mathematical 
problem.” 

“Still” Buchberger adds, “even if we know that a problem is 
intrisically complex, it is very reasonable to continue research on 
better algorithms that may perform much better in certain special 
classes of the problem or may perform better by reducing the *con- 
stant in the exponent’,” a notion from complexity theory regarding 
the rate at which computational effort grows—or explodes—with 
problem size. 

“One of the obstacles is understanding what the real complexity 
of the problems is,” says Little. “There are these drastically bad, 
worst-case examples where the computation of a Grébner basis is 
basically doubly exponential in the size of the inputs” —a level of 
computational complexity that makes the so-called “hard’’ prob- 
lems of computer science look trivial by comparison (see “Tales 
from the Cryptosystem, page 97). Even in the best cases, a Grébner 
basis can look much more complicated than the original system of 
equations (see Box, “Ideal Ideas in Algebraic Geometry.” page 55.). 

Among the algorithm’s applications is computer-aided geomet- 
ric design (CAGD), which uses algebraic geometry to describe 
curves and surfaces that arise, for example, in automobile manu- 
facturing—those sleek new contours on the latest model cars all 
come from equations. Computer graphics are, almost by definition, 
based on CAGD. Even the “non-graphic” world of computer text 
uses algebraic geometry: What’s an “S” if not a curve? 


(a) 4a? z + 6x25 +20 + Sy + 7y?2? +3z4+1 
(b) 6x2? + Ty?2? + da?z + Sy) + 27 +3241 


(c)  Ty2? + 6rz* + Sy? + 4a72 +22 +32z4+1 


Figure 3. Three ways to order the terms of the polynomial 
14 2x +324 4a22 + 5y3 + 6x23 + 7y22?: (a) lexicographic (lex) 
order, (b) graded lex order, and (c) graded reverse lex order. 
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Computer-aided geometric design is dominated by quadratic 
and cubic equations, notes Thomas Sederberg, a computer scientist 
at Brigham Young University. “Ninety-nine percent of the models 
are degree three or less,” he says. That’s handy for the current 
practical limitations of computational algebraic geometry. Using 
the new algorithms, CAGD researchers can, for example, take 
parametric equations that describe two surfaces and derive an exact 
expression for their intersection, or put things in a form that facili- 
tates accurate numerical approximations. 

Algebraic geometry has also worked its way into the theory of 
error-correcting codes. Enrror-correcting codes use a variety of 
mathematical techniques to safeguard information in computer 
storage and transmission (see “Straightening Out Linear Codes,” 
What's Happening in the Mathematical Sciences, Volume 2). They 
made it possible, for example, for Pathfinder to send crisp pictures 
of Mars’s surface across millions of miles of interplanetary static. 
At a more mundane level, error-correcting codes enable compact 
disks to play perfect music even after they’ve been scratched. 

Advances in computer hardware are allowing for the use of 
increasingly sophisticated error-correcting codes, giving greater 
error protection at less computational cost. Researchers have 
found that algebraic geometry offers a way of constructing codes 
that, at least in theory, approach the limits of error-correcting effi- 
ciency. 

“A lot of the interesting codes have algebraic structure of their 
own,” explains Little. “The set of code words might be an ideal in 
a ring or a module over a ring. The computational methods give 
you ways to construct encoding and decoding algorithms for these 
codes. In the best cases, they're actually the best algorithms that 
are known.” 

One particular class of codes based on algebraic curves, devel- 
oped in the 1970's by the Russian mathematician V. D. Goppa, is 
looking increasingly attractive. “In terms of error-correcting capa- 
bility, there are Goppa codes that are even superior to the 
Reed-Solomon ones,” which are currently the mainstay of error- 
correcting technology, Little says. “What's held them back from 
being applied is that people haven’t been able to construct good 
decoding algorithms yet. But that’s in the works.” 

The computational approach to algebraic geometry has even had 
an impact on theoretical work unrelated to computer applications. 
For example, the construction of Grébner bases “is now part of the 


standard arsenal of techniques that people use to prove theorems in 
algebraic geometry,” notes Cox. The ability to compute extensive 
examples has also enabled researchers to discover new facts about 
algebraic curves and surfaces. Indeed, many see the new develop- 
ments as realizing Descartes’s dream of an analytic ap- 
proach to geometry. Asked what the seventeenth-century mathe- 
matician—philosopher might make of computational algebraic 
geometry, Little replies, “He would have felt very much at home.” 


Ideal Ideas in Algebraic Geometry 


Grébner bases are an “ideal” way to work with poly- 
nomial equations—even when they look far from ideal. 

In algebraic geometry, each system of equations is 
associated with an algebraic object known as an ideal in 
the ring of polynomials. If, say, the equations are 


e+y+2?-1=0 
3x? + Qyz — 2ew =0 

2az — 2yw =0 
Qry — 2z —2zw =0 


then the ideal consists of all combinations of the four 
polynomials on the left. For example, the new polynomi- 
al 


P(w, x,y,z) = 2w(27+y? +2? -1) 
+a(3a? + Qyz — 2rw) + y(2rz — 2yw) 
+2(2xry — 22 — 2zw) 


is such a combination. The point is that every solution 
(w, x,y, 2) of the original system of equations is auto- 
matically also a solution of the new equation 
P(w,z,y,z) =0. 

The “coefficients” 2w, x, y, and z in this example 
could have been any polynomials; they were chosen so 
that P simplifies to —2w + 32° + 6ryz —2z*, which 
effectively “solves” for w in terms of the other variables: 
P =0 implies w = 32° + 32yz — 2. 


Researchers have 
found that algebraic 
geometry offers a way 
of constructing codes 
that, at least in theory, 
approach the limits of 
error-correcting 
efficiency. 


WHAT’S HAPPENING IN THE 55 
MATHEMATICAL SCIENCES és 


The search continues 
for efficient ways to 
find “‘good”’ Groébner 
bases. 


56 WHAT’S HAPPENING IN THE 
MATHEMATICAL SCIENCES 


The original equations are said to be a “basis” which 
“generates” the ideal. However, other systems of equa- 
tions can also be used as a basis. A Grébner basis has 
special properties that make it easy, for example, to say 
whether or not a given polynomial is in the ideal, or to 
find out how many solutions the system of equations has. 

For the example above (which is taken from /deals, 
Varieties, and Algorithms, an undergraduate text on com- 
putational algebraic geometry by David Cox, John Little, 
and Donal O'Shea, published by Springer-Verlag), one 
Grébner basis consists of the eight polynomials 
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This may look like a mess, but the last polynomial 
involves only 2, and in this case it can be solved exactly 
for its 7 roots: z=0, +1, +2/3, and +/22/16. By 
back-substituting these roots into the earlier equations, 
one finds there are precisely 10 solutions (w, 2x, y,z) to 
the original system of equations. 

Grdébner bases like this one may be too grubby to 
compute by hand, but computers love “em. And what 
researchers love is the ability to tackle a large class of 
problems with a single, uniform method. The search con- 
tinues for efficient ways to find “good” Grébner bases 
(there can be more than one) and solve other systems of 
equations which until recently were beyond the pale. 


Figure 4, Computers have a field day with geometric problems. 
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A Baker’s Dozen. The key steps in proving the Robbins conjecture, as reported by EQP, an automated 
theorem-proving program developed by William McCune and colleagues at Argonne National Laboratory. 
(See Box, “Substitute Teacher,” page 63 for details.) 
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As Easy as EQP 


ihe phrase il est aisé a voir—‘it is easy to see”—appears 

repeatedly in the treatise Méchanique céleste by the great 
French mathematician Pierre-Simon de Laplace (1749-1827). The 
phrase is common in mathematics, meant to relieve the author of 
spelling out details that come from a standard repertoire of mathe- 
matical techniques. But Laplace’s readers came to dread those 
words. What Laplace saw easily often took hours, if not days, of 
intense mathematical scrutiny by less gifted savants. 

Oddly enough, Laplace was correct: Once the reader finally did 
see how a particular line of reasoning went, the conclusion really 
was easy to see. The trick—and Laplace’s gift—was to find the 
right combination of ideas. 

Could a computer program outpace Laplace? One may already 
have. 

William McCune, a computer scientist at the Argonne National 
Laboratory in Illinois, has developed a suite of automated theorem- 
proving programs. McCune’s programs can return true—false ver- 
dicts on a range of problems in abstract algebra and symbolic logic. 
His most powerful program, named EQP (for “equational prover”), 
can be every bit as cryptic as Laplace: Its output omits steps the 
program considers easy. Following its reasoning can frustrate even 
the program’s creator. 

Cryptic or not, EQP has already proved its worth: In 1996, it 
announced a solution to the Robbins conjecture, a problem in sym- 
bolic logic that had stumped mathematicians since it was first 
posed in the 1930's. For the first time, a famous unsolved problem 
has been solved by a computer doing what can be legitimately 
described as reasoning on its own. 

“It’s a clear landmark” in automated theorem-proving, says 
Stanley Burris, a logician at the University of Waterloo in Ontario, 
Canada. “Now that we know that a computer can solve a difficult 
problem, it opens the door to a lot of interesting research.” 

The Robbins conjecture is a problem about Boolean algebras, 
which are named for the British mathematician George Boole 
(1815-1864), who developed a way of turning logical statements 
into algebraic expressions. A statement such as “P or Q” can be 
expressed as p+gq, while “P or not Q” can be written as 
p+ N(q); a logical expression is True or False depending on 
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whether the value of its Boolean formula is 1 or 0. (Boolean alge- 


Boolean algebra bra requires a peculiar interpretation of addition: 1 + 1 = 1.) The 
requires a peculiar “Not” function N simply flips values; it is sometimes written 
interpretation of N(p) =1-p. 

addition: 1+1=1. A Boolean algebra satisfies the usual commutative and 


associative laws of addition: pt+q=q+p_ and 
(p+q)+r=p+(q+r). Two other laws, or “axioms,” concern 
the function V. The first is p + N(q + N(q)) =p, and the second 


Boolean Basics 
The “standard” axioms of Boolean algebra are 


(1) p+q=q+p 

(2) (p+q)+r=p+(q+r) 

(3) p+N(q+N(q))=p 

(4) p+N(N(q)+N(r)) = N(N(p+4) + N(p +7) 


Everything else can be proved on the basis of these four axioms. 
For example, here’s how to derive the commonplace observation 
about double negatives, N(N(p)) = p: 


N(N(p)) = N(N(p)) + N(N(p) + N(N(p))) (3) 
= N(N(N(N(p)) +P) 
+N(N(N(p))+N(p))) (4) 
= N(N(p+ N(N(p))) 
+ N(N(p)+N(N(p)))) (1) 
=N(N(p+ N(N(p))) (3) 
= N(N(p+N(N(p)))+N(p+N(p))) (3) 
=p+N(N(N(p))+N(p)) (A) 
=p+N(N(p)+N(N(p))) (1) 
=p (3) 


McCune’s automated theorem-proving programs have 
no trouble finding such “simple” proofs (notice that asso- 
ciativity is never invoked!). Getting the same result with 
the Robbins axiom in place of axioms (3) and (4), howev- 
er, is something else again. 
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is p+ N(N(q) + N(r)) = N(N(p+ 4) + N(p+r)). These two 
axioms may appear obscure, but they correspond to rather simple 
logical statements. For example, N(N(q) + N(r)) is an algebra- 
ic way of saying “Q and R2” What’s important is that every other 
“Jaw” of Boolean algebra can be derived from these four axioms 
(see Box, “Boolean Basics”). 

In 1933, the American mathematician Edward Huntington 
showed that the last two axioms can be combined into one: 
p= N(N(p)+q)+N(N(p)+N(q)). Deriving the other 
axioms from this one (and the commutative and associative rules) 
is not at all easy, but Huntington showed how to do it. 

Herbert Robbins, then a student of Huntington and now a pro- 
fessor of mathematics at Rutgers University in New Brunswick, 
New Jersey, suggested an alternative to Huntington’s axiom: 
p= N(N(p+q) + N(p+ N(q))). Robbins’s formula is shorter 
by one N. But does it really replace Huntington’s formula—can 
the rest of Boolean algebra be derived from it? 

That was the $64,000 question for the next 63 years. 

The Robbins conjecture resisted numerous attacks. Alfred 
Tarski, a logician at the University of California at Berkeley, 
worked on it and posed it to many of his colleagues. “When peo- 
ple would visit Tarski at Berkeley, he would often give them some 
tantalizing problem,” says Burris, who was a research associate 
there in the late 1960's. The Robbins conjecture, Burris recalls, 
“was one that he liked to dish out.” 

In 1979, Steve Winker, a student who was visiting Argonne, 
learned of the Robbins conjecture from Joel Berman at the 
University of Illinois at Chicago. Winker told Larry Wos, his 
Argonne advisor, about it. Wos, a pioneer of automated theorem- 
proving (which he prefers to call automated reasoning), suggested 
Winker work “backwards” on the problem, looking for equations 
which, when combined with Robbins’s axiom, would be sufficient 
to derive the rest of Boolean algebra. Wos and Winker hoped that 
one of these equations might itself be derivable from the Robbins 
axiom. 

Winker found several equations of the desired type. But no one 
could see how to derive any of these equations from the Robbins 
axiom. 

McCune arrived at Argonne in 1980, and joined Wos’s team in 
building new theorem-proving programs. The group developed a 
program they dubbed Otter (the name, McCune explains, is based 
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Herbert Robbins. (Photo courtesy of 
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on a computer-science tradition of using animal names for Unix 
programs). In the early 1990's, an early version of EQP (unnamed 
at the time) rediscovered the equations that Winker had found ear- 
lier, by hand. Neither it nor Otter, however, had any luck with the 
Robbins conjecture itself. 

Could the conjecture be false? Researchers began to wonder. If 
the other axioms of Boolean algebra did follow from the Robbins 
equation, of course, a systematic search should, in principle, find 
their derivations. But what if the haystack had no needle? 

In a Ph.D. dissertation at Berkeley in 1972, George McNulty, 
now at the University of South Carolina, proved that asking 
whether the axioms of Boolean algebra are consequences of a given 
set of equations is, in general, an undecidable problem. In other 
words, there is no algorithm that is guaranteed to solve problems 
like the Robbins conjecture, even with unlimited time and memory. 
Could the Robbins conjecture itself be one of these undecidable 
problems? 

The Argonne team kept the Robbins conjecture on the list of 
problems against which to pit their programs. “Whenever we 
developed new techniques, we would try them out [against the 
Robbins equation] and see what happened,” McCune recalls. Over 
the years, the automated theorem-provers graduated from strug- 
gling with problems a high-school student would breeze through to 
knocking off complex equations that would challenge a profession- 
al mathematician. But the Robbins conjecture remained out of 
reach. 

Then, in September 1996, McCune fed the Robbins conjecture 
to his latest theorem-proving program, EQP. Like Otter, EQP tack- 
les problems in equational logic using a method called paramodu- 
lation. Essentially, paramodulation derives new equations from old 
by making substitutions (see Box, “Substitute Teacher”). This is 
similar to how people work with equations—except that EQP han- 
dies extremely long equations without making mistakes. The com- 
puter will also cheerfully explore thousands of dead-ends before 
finding a substitution that leads somewhere. 

Where EQP improves on Otter is that it doesn’t need to explic- 
itly invoke the commutative and associative axioms when it uses 
them; these axioms are “built in” using a procedure called asso- 
ciative-commutative (AC) unification. People do something simi- 
lar when they omit parentheses from expressions like 
2+9+3+8+7+1, automatically rearranging terms to 
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2+8+9+1+3+7. which adds easily to 40. AC unification 
prevents EQP from wasting time on trivialities, and helps the pro- 
gram spot promising combinations of terms within expressions. 

McCune thought EQP stood a chance against the Robbins con- 
jecture. Even so, he recalls, “I wasn’t expecting to find a proof.” 
The search took place through September and into October. EQP 
is not an interactive program, but as it searches it sends progress 
reports—derived equations and various statistics—to an output file. 
If the search doesn’t look promising, or “if [the computer] is doing 
something obviously stupid,” as McCune puts it, the user can tweak 
certain parameters of the program and try again. In all, McCune 
ran 14 searches on three separate computers, burning about five 
CPU-weeks of computer time. 

On October 10, success: EQP reported deriving one of Winker’s 
equations from the Robbins axiom. The derivation had taken eight 
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days, and involved 49,548 intermediate equations. The “proof” 
that EQP spat out, however, was only 13 lines long (see Box, This may be a proof 
“Substitute Teacher.” page 63.). that only a computer 

But was the proof correct? could find. 

“I didn’t believe it,’ McCune recalls. “Programs have bugs. 
And any proof is questionable—especially one found by a pro- 
gram.” 

McCune found himself in the same position as Laplace's read- 
ers: EQP claimed that each line of its proof followed from previous 
lines, and its output included clues to the steps, but nothing was 
spelled out in detail. “Each of those lines is pretty complicated,” 
McCune notes. “It took me hours to verify them.” 

Enlisting Otter and other programs—and also some of his human 
colleagues—to double-check EQP’s claims, McCune finally con- 
cluded there were no mistakes. EQP had succeeded where scores 
of brilliant human minds had failed. 

Indeed, this may be a proof that only a computer could find. 
Even after verifying the correctness of EQP’s derivation, McCune 
admits “I don’t understand the proof.” 

“I understand it from a syntactic point of view,” McCune 
explains, “but I get no mathemtical insight from it, and I don’t know 
of anyone who does.” There seems to be no recognizable pattern or 
strategy in EQP’s derivation, he says. “It just seems to be pushing 
symbols around.” 

Burris agrees. EQP succeeded where humans failed partly 
because machines don’t depend on inspiration as people do: 
Machines can simply juggle huge equations until something good 
happens. In an attempt to make the proof more reader-friendly, 
Burris produced what he calls an “anthropomorphized” version of 
EQP’s argument, introducing abbreviations for subformulas that 
recur throughout the derivation. The anthropomorphized proof is a 
chain of logic that really can be easily checked by hand. But even 
this version, Burris says, is not one that a person would ever dream 
up. “I have absolutely no understanding” of the proof, he confess- 
es. 

Brandon Fitelson, a graduate student in philosophy (with an 
undergraduate degree in mathematics) at the University of 
Wisconsin, is more optimistic about coming to grips with the 
Robbins conjecture. Fitelson has used the computer algebra system 
Mathematica to recast the proof in a reader-friendly, “two-dimen- 
sional” style which, he says, shows the shifting patterns in the 
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derivations. Two-dimensionality appears when one replaces the 
functional notation N(p) with an overline notation, D. Then 
nested negations appear as a “stack” of overlines, and 
those hard-to-parse _ parentheses disappear altogether: 
N(N(p+q)+(N(p)+q)) becomes simply p+q+p+q. 
Human brains still beat machines when it comes to processing 
visual information. 

EQP’s triumph trumps earlier computer-assisted proofs of noto- 
rious problems, such as the Four-Color Theorem. That theorem 
was dispatched in 1976, when a computer program checked thou- 
sands of cases for a possible counterexample. But the heart of the 
proof was an “ordinary,” human argument: that any counterexam- 
ple would have to crop up in what can be considered a fairly short 
computation. The program that “proved” the Four Color Theorem 
was written specifically for that purpose. EQP, on the other hand, 
can try any problem in equational logic. The program doesn’t 
guarantee success. Instead, it gropes about, using various strate- 
gies to guide it in presumably useful directions. Sometimes it fails, 
but more and more often it finds a way through the maze. 

Will mathematicians be out of work one of these days? Hardly. 
If anything, McCune says, automated theorem-proving programs 
will add value to mathematicians’ work by assisting with certain 
tedious aspects of the job, leaving humans free to tackle increas- 
ingly complex problems. 

Maybe the close reasoning that’s commonly associated with 
mathematical thought is analogous to the well defined procedures 
of arithmetic, says Burris. “As humans, we know that doing arith- 
metic is very tiring when the numbers get long. We prefer to have 
the machine do it for us. Perhaps we'll just prefer to have the 
machine do these very long things in math for us, eventually.” 


—_————— 


Yorktown Heights, New York, and Selman at 
Cornell University (previously at AT&T Bell 
Laboratories)—have found evidence that, much as water 
freezes at 32° F, a certain class of problems in logic under- 
goes a phase transition from easy to almost impossible when 
a “size” parameter crosses a threshold. 

The problems Kirkpatrick and Selman examined come 
from Boolean logic, and are known as “k-satisfaction,” or 
k-SAT, problems. In Boolean logic, variables take only two 
values—True and False—and they are combined using the 
ordinary “AND,” “OR,” and “NOT” operations, typically 
represented by the symbols /\, V, and ~, respectively. Given 
an expression such as (a A) V (y/ 2), the satisfaction 
problem is to determine whether there is an assignment of 
True/False values to the variables (in this case x, y, and 2) 
that will “satisfy” the expression—that is, make it true. This 
example allows lots of satisfying assignments, such as x 
True, y False, and z either True or False. 

A k-SAT problem has a special form. For example, 
(e@VyVZ)A(yVzVw)A(e@VGVB)ARVZVw) isa 
3-SAT expression: It consists of several “OR” clauses, each 
with 3 variables (or their negations), that are “AND-ed™ 
together. In general, a k-SAT expression can have an arbi- 
trary number of “OR” clauses “AND-ed” together, as long 
as each “OR” clause consists of k variables. 

Any single clause of a k-SAT problem is easy to satisfy: 
Just pick any one of its variables to be True or False, depend- 
ing on whether that variable appears with the “NOT” sym- 
bol. Such a selection often automatically makes other claus- 
es True as well. When there are only a few clauses in the 
complete expression, or when many different variables are 
used, it’s usually easy to pick assignments that satisfy the 
whole expression. On the other hand, when the number of 
clauses is much larger than the total number of variables, it 
can be difficult, and often impossible, to find a satisfying 
assignment: Choices that make one group of clauses True 
almost always make other clauses False. 

Kirkpatrick and Selman have found that for each value of 
k, the relevant parameter is the ratio \//N of clauses to vari- 
ables. The two researchers have run computer experiments 
on k-SAT problems with k from 2 to 6. In each simulation, 
they fix the total number of variables, NV. Then the comput- 
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ercreates M k-variable clauses, choosing randomly from 
among the N variables and their negations, and checks— 
using a search algorithm—whether each randomly created 
Boolean expression is satisfiable. Checking thousands of 
such expressions for a given choice of Mand N, the com- 
puter obtains an estimate for the probability of a random 
k-SAT expression with N variables and M clauses being 
satisfiable. The researchers then plotted this probability as 
a function of the ratio M/N (see Figure 1). 

What they find is a set of increasingly steep curves for 
each k. Moreover, except for the 2-SAT case, each set of 
curves cross at a precise “critical ratio”: M/N ~ 4.17 
for k =3, M/N ~9.75 for k=4, M/N = 20.9 for 
k=5, and M/N = 43.2 for k =6. These values are 
increasingly close to the corresponding values of 
log 2/log(1 — 2~*), which is a heuristic estimate for the 
value of M//N at which, on average, a random k-SAT 
expression is satisfied by just one assignment of 
True/False values to its variables, (On average, each 
clause reduces the number of satisfying assignments by a 
factor of 1 — 2~*, so the “expected number” of satisfying 
assignments for an M-clause, N-variable expression is 
2%(1—2-*)“_ Setting this equal to 1 and solving for 
M/N gives the aforementioned estimate.) 

The sudden change from predominantly satisfiable to 
predominantly unsatisfiable is strikingly similar to phase 
transitions that occur in physics. That’s what got 
Kirkpatrick, a physicist, interested in the problem. The 
researchers have found that techniques borrowed from sta- 
tistical mechanics closely model the way the probability 
curves steepen at the transition point as the number of 
variables and clauses is increased. In particular, a tech- 
nique known as finite-size scaling can be used to bring all 
the curves into accordance (see Figure 2). 

“The nice thing about finite-size scaling is that it ‘col- 
lapses’ experimental data from many sizes of system onto 
a single, universal curve,” Kirkpatrick explains. “At the 
same time, it proves that the curve sharpens up into either 
a step function or a sharp delta function spike.” 

Kirkpatrick and Selman found the k-SAT peak to 
occur at parameter values that produced equal fractions of 
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Figure 1. Phase transition for k-SAT, from k =2 to k= 6. (Figure courtesy of Scott Kirkpatrick and Bart 
Selman.) 
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Figure 2. Finite-size scaling of the k-SAT transitions. (Figure courtesy of Scott Kirkpatrick and Bart Selman.) 
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satisfiable and unsatisfiable formulas. Another insight has 
come from the work of physicists Rémi Monasson at the 
Ecole Normale Superieure in Paris and Riccardo Zecchina at 
the International Centre for Theoretical Physics in Trieste, 
Italy: They found that for 2-SAT, the transition is “continu- 
ous,” similar to the way iron stops being magnetic at its 
“Curie point” temperature, whereas for 3-SAT or higher, the 
transition is discontinuous, more like the way ice melts. 

The k-SAT problem is important because many search 
algorithms can be formulated in terms of satisfying Boolean 
expressions. In general, k-SAT belongs to a class of prob- 
lems that theoretical computer scientists call V P-complete; 
this suggests there may be no efficient algorithm for deter- 
mining whether any given expression can be satisfied. (One 
exception is 2-SAT, for which an efficient algorithm is 
known.) If there are inherently hard k-SAT problems, 
where are they? 

Kirkpatrick and Selman think the hard cases are concen- 
trated at the transition point. In earlier work with David 
Mitchell at Simon Fraser University and Hector Levesque at 
the University of Toronto, Selman studied the difficulty that 
one commonly used search algorithm, known as the 
Davis—Putnam procedure, encounters with 3-SAT. The three 
computer scientists found a peak in the number of “calls” 
the procedure makes—a rough indication of the amount of 
work the algorithm is doing—that becomes increasingly 
prominent as the formulas grow in size (see Figure 3). The 
peak is positioned near what Kirkpatrick and Selman have 
identified as the transition point for 3-SAT. 

More recently, Monasson, Zecchina, Kirkpatrick, 
Selman, and Lidror Troyansky, a computer scientist at 
Hebrew University in Israel, have investigated what happens 
“between” 2- and 3-SAT, studying formulas with a mixture 
of 2-clauses and 3-clauses. They've discovered another 
transition point: / ~ 2.413. For formulas with fewer than 
41.3% 3-clauses, the cost of finding a solution increases lin- 
early with the size of the formula; for those with more than 
41.3%, the cost grows exponentially. 

Knowing where the hardest problems are likely to occur 
doesn’t make them any easier to solve, but at least it gives 
you an idea what to expect. That way you can plan how long 
a lunch break to take while you let the computer plug away. 
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Figure 3. The difficulty of solving k-SAT problems appears to peak at the phase transition. (Figure courtesy of Bart 
Selman.) 
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Beetle Dynamics. Biologists and mathematicians have teamed up to study the population dynamics of flour beetles, 


a 
common pest in stored grain. In response to the riddle, “why do flour beetles cross the road?” their answer is, “They're 
attracted by a saddle point.” (Figure courtesy of Bob Constantino.) 
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Beetlemania: 
Chaos in Ecology 


t’s been a tenet of theoretical biology for the last 20 years that 
what look like random fluctuations in a population can instead be Flour beetle 
something else: chaos. Now, for the first time, there is experi- populations show all 
ret ite to wine point of view. An —— the classic signs of 
team of biologists and mathematicians has data to show that chaot- . . 
ic dynamics can describe the course of a population, at least in the chaotic dynamics. 
laboratory. 
Bob Costantino, Jim Cushing, Brian Dennis, and Robert 


Desharnais have experimented with a tiny critter known to scien- 
tists as Tribolium casteneum (and to the rest of us as the flour bee- 
tle) and compared their results with predictions from a mathemati- 
cal model. Their flour beetle populations, they conclude, show all 
the classic signs of chaotic dynamics, right down to such arcana as 
period-doubling bifurcations and strange attractors. The 
researchers described their findings in an article published in the 
journal Science (January 17, 1997). 

Costantino and Desharnais are experimental biologists at the 
University of Rhode Island and the California State University at 
Los Angeles, respectively, Cushing is a mathematician at the 
University of Arizona, and Dennis is a statistician at the University 
of Idaho. The four began collaborating in 1991. Their research has 


Beetle Maniacs. From left to right, Shandelle Henson, Jim Cushing, 
Robert Desharnais, Brian Dennis, and Bob Costantino. (Photo courtesy of 
Shandelle Henson.) 
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expanded into some of the finer points of chaotic dynamics (which 
Costantino appreciatively calls the “bells and whistles” of the sub- 
ject). And coming full circle, they are also taking another look at 
the role of random fluctuations—but this time coupled with chaos. 

Chaos and randomness are not the same thing. A random fluc- 
tuation is one that cannot be predicted, regardless of how much 
information you have—loosely speaking, the future depends on the 
toss of a coin. Chaotic systems, by contrast, are predictable, but 
only if you somehow have access to a huge amount of information. 

The hallmark of chaos is that minute differences grow exponen- 
tially into huge disparities. Chaos theorists call this “sensitivity to 
initial conditions.” It’s also known as the butterfly effect: The gen- 
tle flap of a butterfly’s wings on a summer afternoon could, con- 
ceivably, spell the difference between fair weather and a hurricane 
two months later. 

Why would anyone doubt that chaos exists? Ever since the sub- 
ject burst upon the scene in the late 1970’s and early 80's, scientists 
have found chaos a power- 
ful explanatory tool. Chaos 
theory—and, more general- 


Figure 1. The dynamics of the logistic 
equation vary from simple equilibrium 
to chaos. The plots correspond to ly, nonlinear dynamics— 
r = 2.9, 3.25, 3.51, and 4, with ini- has scored some signal suc- 
ee i= cesses in physics and chem- 
istry. New mathematical 
techniques and computer 
simulations have helped researchers explore in detail the complex 
interactions of particles, fluids, and chemical compounds. In ecol- 
ogy, by contrast, chaos theory has functioned largely as a caution- 
ary tale, with the moral that simple mathematical models of any 
sort may not capture the essence of complex biological systems. 
The classic formula in mathematical ecology is known as the 
logistic equation: x,,.; = rz,,(1—«,,). The logistic equation is a 
highly idealized description of how a population might fluctuate. 
The numbers x, 2), 12, ... represent the size of a population at 
successive stages (years, seasons, generations, etc.). More precise- 
ly, they represent fractions of a theoretical upper bound on the pop- 
ulation. For instance, x39 = 0.73 means that a population occupies 
73% of its niche at the tenth stage. The “growth parameter” r is a 
number between | and 4. (If r is less than 1, then the numbers z:,, 
rapidly approach 0, meaning the population dies out. If r is greater 
than 4, the z,,s can wander outside the range from 0 to 1, 


becoming meaningless in the intended ecological context.) 

Despite its apparent simplicity, the logistic equation yields some 
amazingly complicated scenarios. When the growth parameter r is 
fairly small—between 1 and 3—things proceed as an ecologist 
would expect: The population reaches an equilibrium state, usual- 
ly quite quickly (see Figure 1a). But for larger values of r, the pic- 
ture gets increasingly complicated. First, the equilibrium state 
splits, or “bifurcates,” into nwo alternating states (see Figure 1b). 
At higher values, these bifurcate again to give a cycle of four states 
(see Figure Ic). 

This phenomenon, called period doubling, occurs more and 
more often as the growth parameter increases. At a value of 
approximately r = 3.57, “chaos” sets in: The typical population 
no longer settles into any kind of cycle (see Figure 1d). The com- 
plexity of these dynamics is illustrated by the logistic equation’s 
“bifurcation diagram” (see Figure 2): above each value of r are 
plotted the values of x that belong to the system’s “attractor.” 
These attractors range from the single fixed point for small values 
of r to an infinite range of x for larger values of r, when chaos sets 
in. 

The strange behavior of the logistic equation was pointed out to 
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Figure 2. The bifurcation diagram of the logistic equation summarizes its complex dynamics. Each vertical slice of the 
diagram consists of the system’s “attractors” for the corresponding value of r. The purple curve shows the unique fixed 
point, which is unstable beyond r = 3. 
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old favorite of 
experimental 
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ecologists in the 1970's by Robert May, a mathematical biologist at 
Oxford University. Around the same time, Mitchell Feigenbaum, a 
theoretical physicist now at Rockefeller University, made a detailed 
study of the period-doubling phenomenon. He conjectured that 
certain key aspects of the way chaos develops in the logistic equa- 
tion are shared by all equations of a similar type. In other words, 
period doubling and chaos are not accidental or even unusual: 
They are the norm for a broad class of mathematical equations and, 
by implication, in many scientific applications. 

Feigenbaum’s “universality” conjecture was rigorously proved 
in 1981 by Oscar Lanford III, now at the Eidgendssische 
Technische Hochschule in Zurich, Switzerland. Through the 
1980's, the study of chaos became all the rage in the physical sci- 
ences, to the point that chaos was sometimes used to explain what 
used to be called acts of God. Mathematicians, too, took delight in 
exploring the wild and woolly world of nonlinear dynamics, 
prospecting for new forms of order amidst the chaos. 

But experimental biologists and ecologists never really warmed 
to the new point of view. “A lot of biologists still think in terms of 
the ‘balance of nature,’ says Costantino. According to this view- 
point, all populations tend toward equilibrium, deviating from it 
only in response to external events; wild population swings are due 
to identifiable—and thus, in principle, preventable—factors. 

The balance-of-nature adherents cannot simply be dismissed as 
mathematical Luddites, instinctively resisting change. Biologists 
are properly wary of the radical new viewpoint: “There are no data 
from natural populations that have been convincingly shown to be 
chaotic,” Costantino notes. 

But now, at last, there are clear signs of chaos in data from a lab- 
oratory population. 

Costantino and colleagues have been studying flour beetle pop- 
ulations under carefully controlled conditions. They have found 
remarkably good agreement between experimental results and the 
predictions of a logistic-type mathematical model. The group 
expects their findings to encourage more biologists to design exper- 
iments to test the predictions of nonlinear dynamics, and to look for 
further signs of chaos. 

The flour beetle is an old favorite of experimental biologists, 
partly because it’s easy and cheap to work with in the laboratory. 
But there are more compelling economic reasons to study flour 
beetles: They are among the most prevalent pests affecting the 


Model Behavior 


The demographics of beetles in a bottle can be modeled 
by a system of three equations: 


Inui = 6. Ane EAA cet ln 
Pri = Ly(1— rz) 
Ansa = Pre + An(1— 1a) 


The coefficients ce4, cer, and cp4 denote rates of can- 
nibalism of eggs by adults, eggs by larvae, and pupae by 
adults, respectively. The coefficient b represents the average 
number of larvae produced per adult, and the coefficients r, 
and r4 denote the mortality rates of larvae and adults. 
Typical values are cz4 = cet = 0.01, cpa = 0.005, b= 7, 
ry = 0.2, and r4 =0.01. If you like, each census predic- 
tion can be rounded to the nearest integer. 

With these values for the coefficients, the equations pre- 
dict that an equilibrium population is reached with around 40 
larvae, 30 pupae, and 400 adults. But artificially raising the 
adult mortality rate to r.4 = 0.96—the mathematical equiv- 
alent of spraying with pesticide—makes the dynamics sensi- 
tive to changes in the cannibalism rate cp4. A computer- 
drawn bifurcation diagram (see Figure 3, page 78) shows 
many of the same features as the corresponding diagram for 
the original logistic equation. 

One big difference, however, is the existence of multiple 
attractors in the flour-beetle model. For example, at values 
of cp, near 0.55, the 11 strands of the bifurcation diagram 
correspond to two separate attractors: a 3-cycle comprising 
the top three strands and an 8-cycle comprising the rest. And 
at cp4 = 0.50, the 3-cycle coexists with a chaotic attractor. 


stored grain industry. Flour mills spend millions discouraging the 
critter; these ubiquitous beetles have even been found in granaries 
entombed with the pharaohs. 

“It’s an outrageous pest,” says Costantino. “It’s probably the 
primary pest in all grains.” 

The flour-beetle experiments used populations of a few hundred 
beetles, kept in half-pint milk bottles and fed a mixture of wheat 
flour and dried brewer’s yeast. Flour beetles have a relatively sim- 
ple and, for experimenters, convenient life cycle: Each beetle 
spends about two weeks as a larva and two more as a pupa before 
becoming a reproducing adult. This schedule means that a fort- 
nightly census of adults, pupae, and larvae offers a good log of the 
beetles’ demographics. Mortality rates can also be artificially 


A fortnightly census of 
adults, pupae, and 
larvae offers a good 
log of the beetles’ 
demographics. 
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manipulated by simply adding or subtracting beetles at each 
census—if you want the population to think adults are dying at a 
high rate, for example, just don’t put them all back after you’ve 
counted them. 

Mathematically, a census consists of numbers L,,, P,, and A,, 
representing the count of larvae, pupae, and adults present at cen- 
sus number n. The researchers began by constructing equations to 
describe how the numbers of the nth census might determine those 
of the (n + 1)th census: Theoretical considerations suggested a 
system of logistic-type equations (see Box, “Model Behavior,” pre- 
vious page). The researchers then studied bottle-beetle popula- 
tions to estimate the numerical coefficients in their model. 

The next step was a mathematical analysis of the dynamical sys- 
tem. The system’s nonlinearity stems from a gruesome-sounding 
aspect of the flour beetle’s diet: cannibalism. If adults and larvae 
didn’t eat eggs, and if adults didn’t also feast on pupae, the popula- 
tion would be described by a simple linear model, and the number 
of beetles would explode exponentially, zooming into the billions 
within a year. Mathematically, cannibalism keeps the population in 
the hundreds. But the precise dynamics depends on the values of 
the model's coefficients. 


250 
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Figure 3. The bifurcation diagram for flour bettle populations, with adult mortality raised to 96%, is similar to that of 
the logistic equation (Figure 2), but also has important differences. The values of cp.4 chosen for experiments and the 
curious “cut-off” value \n(2)* 0.69 are indicated. 
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At “natural” values—meaning when the researchers keep their 
beetles supplied with nutrients and take a biweekly census—the 
population quickly reaches and maintains an equilibrium state, just 
as the balance-of-nature viewpoint expects. But the equations pre- 
dict that lots of other kinds of behavior occur at other, “artificial” 
values. In particular, the researchers found that raising the adult 
mortality rate to around 96% made the dynamics especially sensi- 
tive to variations in the rate of cannibalism of pupae by adults. 
Much like the logistic equation itself, the flour beetle model yields 
a complex bifurcation diagram as its cannibalism coefficient cp 
varies from 0 to 1 (see Figure 3). 

The final step was to see whether the model’s predictions are 
borne out in reality. To this end, Costantino has kept two dozen 
milk-bottled populations in a dark, climate-controlled laboratory 
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loops). (Reprinted with permission from “Chaotic Dynamics in an Insect Population,” Costantino 
et al., Science, volume 275, 17 January, 1997, pp. 389-391. Copyright 1997 American Association 
for the Advancement of Science.) 
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As with most experiments, the data are scattered around what the 
theory predicts, but statistical analysis of the scatter indicates the 
model is an accurate description of what the beetles are up to. 

“The data show that this laboratory system is in fact chaotic,” 
concludes Costantino. 

The group is now investigating details of the dynamics. In par- 
ticular, when the mathematical model has a chaotic attractor, it pre- 
dicts that data points will appear more densely in certain parts of the 
attractor than in others, corresponding to what mathematicians call 
an invariant measure. 

The group, now including Shandelle Henson at the University of 
Arizona, is also experimenting with the beetles’ food supply, alter- 
nating between 32 grams of flour at one feeding and only 8 grams 
the next (20 grams is the normal ration; the beetle bottles are 
cleaned and resupplied at each census). This experiment, 
Costantino explains, is a follow-up to earlier work 
by David Jillson in 1980, who reported a surprise: 


“The data show that this The feast-and-famine cycle leads to an overall 
laboratory system is in fact increase in population. 
chaotic.” Mathematically, the cycle can be described by 


alternating the cannibalism rates between high and 
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low values (in effect, beetles survive the lean weeks 

by snacking on one another). This complicates the 
analysis, but Cushing and Henson have found that the model agrees 
with and helps explain Jillson’s finding. The new experiments 
investigate how well the model describes details of the population 
dynamics. 

The researchers hope their results will convince others to look 
more seriously for chaos and other nonlinear effects in biological 
systems. Costantino stresses that their experiments, while done 
with real beetles (as opposed to purely mathematical computer sim- 
ulations), are still highly artificial. “Most ecologists and popula- 
tion biologists are very skeptical about whether this sort of thing 
occurs in the real world,” he notes. 

That skepticism will be allayed (or justified) only by more 
experiments. But better judgments will also require more mathe- 


matical research, Costantino thinks. In particular, he says, 
the analysis of chaotic systems needs to be coupled with the 
analysis of random, or “stochastic,” systems, in order to 
better account for the seemingly inevitable scatter of data 
(see Figure 5). 

At first blush, combining chaos and stochasticity seems 
contradictory: After all, one of the original selling points 
for chaos was that it could explain a lot of things that had 
been written off as random. But there’s actually nothing to 
prevent both chaos and randomness from acting—they 
could be two sides of one coin. 

Developing the mathematics of nonlinear stochastic 
dynamics won't be easy. Nonlinearity and nondeterminism 
are difficult enough individually; blending them seems dou- 
bly difficult. It’s not clear, for example, how to define cer- 
tain key concepts of chaos, such as bifurcation and 
Lyapunov exponent (a measure of a system’s sensitivity to 
initial conditions), when the dynamics are partly random. 

But the payoff should be well worth the effort. 
Randomness can enrich the already lively dynamics of 
chaos. Nonlinear systems, for example, often have multiple 
attractors and unstable “saddle” points which attract points 
from one direction but ultimately repel them in another 
direction. When the dynamics are deterministic, a given 
point may initially head toward a saddle, but eventually 
land in an attractor, where, by definition, it stays forever. If 
a little randomness is added, however, points can be 
knocked out of an attractor, wander around, perhaps revisit 
a saddle, and land in another attractor. Over time, a single 
point in a stochastic system can “explore” the whole range 
of the underlying dynamics. 

This interplay of stochasticity and nonlinear dynamics 
“seems to be a real one in terms of real ecological issues,” 
notes Costantino. The beetles still have a lot to tell, he says: 
“It’s really just getting exciting.” 
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Figure 5. A 3-d phase-space comparison of 
experimental data (a) to the output of a sto- 
chastic model (b) for the value cpa = 0.35. 
The dark loop in both plots is the chaotic 
attractor for the deterministic model. 
(Reprinted with permission from “Chaotic 
Dynamics in an Insect Population (Response 
to Perry et al,” R.F. Costantino, R.A. 
Desharnais, J.M. Cushing, and Brian Dennis, 
Science, volume 276, 20 June, 1997, pp. 
1881-1882. Copyright 1997 American 
Association for the Advancement of Science. 
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Add Hoc. A precursor to the modern desktop computer, the Burroughs adding machine was 
“hardware” incarnate. The tip of the handle is visible just above the keyboard. Will quantum 


computers make today's technology look equally quaint? (Figure courtesy of Unisys 
Corporation.) 
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From Wired to Weird 


n 1886, William Seward Burroughs and two partners formed the 

American Arithmometer Company, to market an extraordinary 
“Calculating Machine” which Burroughs had invented the previous 
year. The Burroughs machine was a mechanical device with a key- 
board and a printer. Designed to help bookkeepers, it was operat- 
ed by pulling a handle. The first design, however, had a minor flaw: 
The answers that inexperienced users got depended on how hard 
they yanked the handle. Burroughs’s solution was an oil-filled 
dashpot, which regulated the force transferred through the handle 
to the mechanism. By 1926, more than a million Burroughs adding 
machines had been sold. 

Today’s electronic computers are obviously an advance over 
their mechanical ancestors. Not only are these silicon marvels far 
faster, but their programmability makes them much 


more versatile than the limited-purpose adding 


machines they replaced. But are they the last word in When—or if—quantum 


rapid calculation? computation becomes a reality, 


Probably not. The computer of the twenty-first cen- computer scientists may need to 
tury might be as different from today’s calculating reconsider the idea of , 
prodigies as they are unlike the buggy-whip adders of computational complexity. 


yore. The crystal ball is still cloudy, but a number of 
researchers are focussing on an approach that could 
prove to be—literally—a quantum leap in computer technology: 
building machines capable of doing calculations of unprecedented 
size and scope by exploiting the weird rules of quantum mechanics. 

A “quantum computer,” as these yet-to-be-built machines are 
called, could tackle problems that no conceivable improvement on 
today’s electronic machine will ever solve, including problems 
whose present difficulty underlies many of today’s computer secu- 
rity systems. When—or if—quantum computation becomes a real- 
ity, computer scientists may need to reconsider the idea of compu- 
tational complexity. 

Mathematicians have played key roles in bringing quantum 
computation closer to reality. In 1994, Peter Shor at AT&T Labs 
(then AT&T Bell Labs) showed that factoring large numbers—the 
apparent difficulty of which is the basis for one of today’s most 
widely used computer cryptography systems—can, in principle, be 
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done quickly by a quantum computer. More recently, Shor and oth- 
ers have found ways to detect and correct errors that are likely to 
occur during quantum computation, a feat that many thought the 
nature of quantum mechanics made impossible. 

What is quantum computation? Let's start with a seemingly 
simpler question: What is traditional computation? 

Roughly speaking, a computation is a procedure that turns 
“input” into “output.” In today’s digital world, input, output, and 
everything in between consists of strings of 0’s and 1’s. In many 
computations, indeed, the output is a single 0 or 1: Yes or No, 
Black or White, Right or Left, Buy or Sell. Such computations 
solve what are called “decision problems,” since the output can be 
interpreted as a Yes or No decision. In other cases, the output can 
be viewed as the combined solutions of several separate decision 
problems: Is the first bit a 1? Is the second bit a 1? and so on. 

Traditional computation has two key features. First, computa- 
tion is “serial”: It proceeds one step at a time. Second, computa- 
tion is “deterministic”: If you run a program twice, you get the 
same answer each time. Nothing is left to chance. 

In practice, both of these features are often intentionally violat- 
ed. In many computations, some of the steps can be done in any 
order, so it’s often possible to do several steps simultaneously. This 
sort of “parallel” processing is becoming common. Such heavy- 
duty scientific computations as weather forecasting, which amount 
conceptually to “updating” values assigned to grid points based on 
the values assigned to neighboring grid points, are now usually 
done on parallel computers. Indeed, the main limitation of parallel 
processing seems to be the trouble computer scientists have in not 
thinking “serially” when programming in parallel. 

The deterministic feature of traditional computing may also be 
violated. Researchers have found good uses for programs that 
incorporate random bits (see “Random Algorithms Leave Little to 
Chance,” What's Happening in the Mathematical Science, Volume 
2). Computer security, for instance, relies on randomness: 
Passwords, for example, are pointless if they don’t have a random 
aspect. 

Parallel processing and random algorithms are used mainly to 
get around the bugbear of computational complexity: the fact that 
the amount of work done by a deterministic, serial computation 
tends to grow—often alarmingly—with the size of the input. How 
quickly or efficiently a computation can be carried out is a central 
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Tew iS ‘izier, the mathematician Sissa Ben ir, 

for inventing the game of chess. “Name your gift,” the king through all, el large 
said. Sissa Ben held up the chessboard. “Give me one grain fraction, of the possible 
of wheat for the first square, two for the second, four for the 0-1 combinations of 
third, and so on, doubling the number of grains for each of many bits. 

the 64 squares,” the mathematician humbly asked. id 

The king thought him a gentle fool. He could have asked 
for gold or jewels, but instead wanted just a handful of 
wheat, not even enough for a loaf of bread. King Shirham 
ordered his servants to go to the kitchen and bring back a 
sheaf or two of wheat. 

The rest of the court laughed at Sissa Ben, and kept up its 
chortling as the grains of wheat were laid out along the first 
row of the chessboard: 1, 2, 4, 8, 16, 32, 64, 128. Nor did 
the second row make much of a dent in the king’s pantry, 
although the grains would no longer fit within the actual 
squares of the board: 256, 512, 1024, 2048, 4096, 8192, 
16384, 32768. “You might make a cake after all,” the king 
observed with a smile. 

Filling out the third row suddenly had the king’s servants 
scuttling back to the kitchen: 65536, 131072, 262144, 
524288, 1048576, 2097152, 4194304, 8388608. By now 
they were dipping into their fourth bushel of wheat from the 
palace supplies. (A bushel of wheat, they found by counting, 
contains about five million grains.) 

The fourth row exhausted the king’s granary entirely (not 
to mention the poor servants, who did all the hauling and 
counting). The thirty-second square alone consumed more 
than 400 bushels of wheat. By this time the king’s counte- 
nance had run the gamut from amused, through bemused, to 
confused. “What's going on here?” he asked. 

“Your highness,” said Sissa Ben, “there’s not enough 
grain in your whole kingdom to satisfy my humble request. 
“The final square alone will call for nearly two trillion 
bushels of grain. That's as much wheat as your kingdom will 
produce in the next ten thousand years, if there are no 
droughts. Or perhaps, since he who owns the wheat is said 
to be in charge, I should say my kingdom.” 

Unfortunately, Sissa Ben had forgotten that King Shirham 
still commanded the court swordsman-executioner. Which 
only goes to show, the powers of 2 are still no match for the 
power of One. 


question of theoretical computer science. 

Real trouble occurs when a computation needs to search through 
all, or a large fraction, of the possible 0-1 combinations of many 
bits. Factoring large numbers is one example; the celebrated “trav- 
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Figure 1. “Wave-Particle Duality.” In quantum mechanics, light is both a wave and a particle. Philadelphia- 

based artist Mel Fisher has translated this principle into sculpture. (Photo courtesy of Mel Fisher and Will 

Brown, photographer.) 
eling salesman problem” is another. In the former case, each com- 
bination of, say N bits represents a possible factor of a 2N-bit 
number; in the latter case, the combinations of bits describe possi- 
ble itineraries that visit each of N cities in some order. In each 
case—and many more like them—the number of possibilities 
grows exponentially with the size of the input, soon outstripping 
any conventional computer's abilities (see Box, “Double, Double, 
Toil and Trouble,” previous page). 

Random algorithms sometimes reduce the “average cost” of a 
computational search down to a reasonable amount, and parallel 
computation speeds things up, just as a dozen combines can harvest 
a field more quickly than one. But random algorithms may not 
exist for every problem—none is known for factoring or for the 
traveling salesman problem—and the speed-up offered by parallel 
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computation is limited by the number of processors available: Add 
one more bit to the search, and, roughly speaking, you need either 
to double the number of processors or to wait twice as long for an 
answer. 

But what if your computer could somehow entertain all 2 com- 
binations of N bits simultaneously, in a single processor? 

Enter the quantum computer. 

The prospect of a single processor doing many things at once is 
bizarre and counterintuitive. But so is quantum mechanics, and 
that’s where the hope lies. Quantum mechanics, for all its weird- 
ness, is the theory that physicists believe is the “true” description of 
nature (see “A Prime Case of Chaos,” page 21). The rules of quan- 
tum mechanics sometimes lead to strange predictions, but none has 
ever been overturned by experiment. 

The essential quirkiness of quantum mechanics resides in the 
way it answers Hamlet's query, “To be or not to be?” with the sub- 
versive comeback, “Some of each.” In “classical” physics, every- 
thing is in a definite state: Each object has an exact position at each 
moment and moves with a precise amount of momentum, a spin- 
ning object has a precise axis of rotation, and electromagnetic 
waves—light—are polarized in particular directions. But in quan- 
tum mechanics, “reality” is a probabilistic mixture of classically 
definite states. An electron, for example, can be in two places at 
once, its “spin” (which is actually a purely quantum-mechanical 
property only loosely related to rotating objects) can be along two 
different axes at the same time, and its encounter with a photon— 
the quantum of electromagnetic radiation—can leave it in several 
energy levels at once. 

It gets stranger still. The probabilistic mixtures aren’t just ratios 
of different states, say a 50:50 mix of the ground and excited ener- 
gy levels for an electron. Instead, the combinations use complex 
numbers for the different states’ contributions. A typical quantum 
state is written as a|0) + 5|1), where |0) and |1) denote, say the 
electron’s ground and excited states, and a and b are complex num- 
bers with the property that |a|? + |b|? = 1. (The “|)” notation was 
introduced by the British physicist Paul Dirac, one of the pioneers 
of quantum theory.) 

The key to quantum mechanics is the probabilistic interpretation 
of the complex numbers a and b. When the energy level of the 
electron is measured, it is found either in the ground state or in the 
excited state—the system is said to “collapse” into one of the two 
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states. If the electron’s actual state (before the measurement) is 
a\0) + b|1), the collapsed state is either purely |0) or purely |1), 
with probabilities |a|? and {b|?, respectively. 

A sensible-sounding interpretation of what's going on is that the 
electron really was in the collapsed state all along. In other words, 
the physical process (such as a laser trained on the electron for a 
specific length of time) that put the electron in the state that physi- 
cists call a|0) + 6|1) really just put it in one of the two pure states 
{0) or |1), but did so at random with the respective probabilities 
{a\? and |bJ?. 

This interpretation sounds reasonable, but it simply doesn’t 
work: It leads to predictions that are directly contradicted by phys- 
ical experiment. 

One flaw in the “sensible” interpretation is that it disregards the 
fact that the complex numbers a and b can point in different direc- 
tions in the complex plane—in the vernacular of physics, they can 
be “out of phase.” The effects of phase differences are familiar in 
classical wave mechanics: the physics of light, sound, and water 
waves. Indeed, much of the mathematical apparatus that physicists 
use to study these familiar phenomena carries over into quantum 
mechanics. But quantum theory calls for more. For instance, there 
is no notion of, say a water wave “collapsing” just because some- 
one looks out from the shore. (Nor have philosophers ever postu- 
lated that just one person, selected at random, hears the proverbial 
tree fall in the forest. Maybe they should.) Quantum mechanics 
allows such paradoxical effects; quantum computation depends on 
them. 

So how does quantum computation work? 

It’s best described in an idealized setting—in part because no 
one has built more than bits and pieces of a real quantum comput- 
er. Imagine a string of N particles, each of which can be in a con- 
trolled mixture of two energy levels. (Researchers such as David 
Wineland and Chris Monroe at the National Institute for Standards 
and Technology have reported some initial success controlling 
energy levels in a short string of beryllium ions.) A typical 
observed state of the system can be written as |u; uo... uy), 
where each wu is either a 0 (ground state) or a 1 (excited state). But 
during the computation, the system can be in a mixture of all 2¥ 
combinations of these observable states. For example, a 2-particle 
computer can be “programmed” to start in the state 
3|00) + 5|01) + 3/10) + 5|11). (Note that the sum of the squares 


of the coefficients is 1: (5)? + (3)? +(3)?+(3)?=1. For the 
general state, a|00) +5|01) + c\10) + d|11), the requirement is 
Jal? + |b]? + lel? + |d? = 1.) 

Each state of an N-particle quantum computer can thus be 
viewed as a “unit vector” in a 2’-dimensional (complex) vector 
space. In order to exploit quantum mechanics, each step in a quan- 
tum computation must be simply a “rotation” of the vector space— 
a linear transformation that causes vectors to point in new direc- 
tions, but leaves their lengths unchanged. For example, the trans- 
formation that changes a\00)+b/01) +c\10)+d\11) into 
c|00) + d|01) + a|10) + 5|11) is a rotation; as a logical operation, 
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Figure 2. Each bit in a 3-bit “word” 
can be viewed as a pair of opposite 
faces of a cube (top). The picture gets 
a bit dicey in 4 dimensions (bottom). 
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with 1 and 0 denoting True and False, it can be said to “negate” the 
state of the first particle. 

Which brings up the subject of quantum bits, or “qubits,” as 
they’re called. In classical computing, a bit is just a pair of possi- 
bilities, a 1 or a 0 in a particular location. It can be thought of as a 
pair of opposite (V — 1)-dimensional faces of the N-dimensional 
hypercube (see Figure 2). A qubit, on the other hand, is a pair of 
perpendicular 2~!-dimensional subspaces of the 2-dimensional 
vector space. Measuring, or observing, a qubit causes a “collapse” 
into one of the two subspaces: If the state before the qubit 
is observed is aii + bi, where @ and @ are unit vectors in 
the qubit’s two 2‘~! 
\a|? + |b\? = 1, then the state actually observed is either 7 


-dimensional subspaces and 


or @, with probabilities |a|* and |b|?, respectively. 

In more physical terms, a qubit can be thought of as a 
pair of possible results of a quantum-mechanical measure- 
ment. What's important is that various measurements are 
possible in an N-particle system. Instead of measuring, 
say the energy levels of individual particles, one might 
instead measure whether two particles are in the same ener- 
gy level. In a 2-particle “computer,” the corresponding 
qubit is the pair of subspaces containing states of the form 
a\00) + b|11) (same energy level) and a|01) + b|10) (dif- 
ferent levels), respectively. 

The trick to quantum computing, then, is to specify a 
sequence of abstract rotations in this high-dimensional vec- 
tor space that will turn a known initial state into one whose 
qubits, once measured, correspond to a solution to a prob- 


Peter Shor. (Photograph courtesy of lem. That's easier said than done: The restriction to rota- 
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tions, the only transformations that quantum mechanics 

allows, makes designing quantum algorithms seem a daunt- 
ing task—like rock climbing with both hands tied behind your back. 
Indeed, it wasn’t clear until recently that quantum computers could 
solve any problem of practical interest. 

Then came Peter Shor’s factoring algorithm. 

The foundation for a theory of quantum complexity was already 
in place. In a paper published in 1993, Ethan Bernstein and Umesh 
Vazirani at the University of California at Berkeley, gave a precise 
formulation for a “Quantum Turing Machine” and constructed a 
problem that could be solved efficiently (in what's called quantum 
polynomial time) by such a machine, but not by a classical comput- 


er. Their example, however, was theoretical rather than practical; it 
was not a computation that anyone would care to run on an actual 
machine. 

Factoring, however, is a problem that people do care about— 
deeply. Number theorists have their own reasons for wanting to fac- 
tor large numbers quickly, and all mathematicians like the challenge 
of a seemingly difficult problem. But factoring has also become a 
problem of considerable socio-economic importance: The (present) 
difficulty of factoring large numbers underlies one 
of the most popular approaches to computer 
security, known as the RSA public-key 
cryptosystem (see “The Secret Life 
of Large Numbers,” What's 
Happening in the Mathematical 30 
Sciences, Volume 3). An effi- 
cient factoring algorithm 
could put countless comput- 
er systems at risk. 

In 1994, Shor showed 
that factoring the product of 
two N-bit numbers on a 
quantum computer is almost 
as easy as multiplying the 
two factors in the first place. 

More precisely, the number of 20 

steps required for factoring grows 

just slightly faster than N?. Shor's 

algorithm exploits a close connection 

between number theory and geometry: the 

fact that divisibility of numbers can be recast in 
terms of rotations of a vector. 

Suppose you want to know whether 5 divides 9. Start with a 
vector in the xy-plane, pointing in the x direction. Rotate it (coun- 
terclockwise) by nine-fifths of a full circle, i.e., by 27 x 9/5 radi- 
ans (see Figure 3). Now take the angle this vector makes with the 
x-axis and multiply it by 2, 3, 4, and 5, producing a total of five vec- 
tors. These five vectors are nicely balanced, pointing in five sym- 
metrically-spaced different directions. Therefore their sum, as vec- 
tors, is zero—if you think of the vectors as forces trying to pull the 
origin out of position, the forces exactly cancel out. The cancella- 
tion occurs because 5 does not divide 9. 


40 
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Figure 3. Geometric “proof” that 5 


does not divide 9. 
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On the other hand, if you’re checking whether 3 divides 9, the 
starting vector is rotated by nine-thirds of a full circle, and then this 
angle is multiplied by 2 and 3. In this case, all three vectors point 
in the same direction, namely along the positive x-axis, so their 
sum is non-zero. This geometric property holds in general: The 
vectors generated by non-divisors cancel each other out; the vectors 
generated by divisors don’t. 

Very roughly speaking, Shor’s algorithm uses this geometric 
property of numbers and the superposition principle of quantum 
mechanics to test all possible divisors simultaneously. (The details 
are a lot more complicated. Shor’s algorithm doesn’t actually pro- 
duce a factor, but a related number which can be efficiently con- 
verted to a factor by a classical, non-quantum computation.) 

Shor’s algorithm was immediately hailed as a “killer app”; an 
application that justifies the cost of building quantum computers. 
Whether that’s literally true remains to be seen. But if researchers 
discover quantum algorithms for other important problems, the 
pressure to develop the hardware will continue to mount. 

What's really holding back quantum computing right now is the 
difficulty of building physical devices that do—reliably—what the 
theory requires. Quantum computers need the equivalent of 
Burroughs’s dashpot. 

Several research groups are exploring various approaches (see 
Box, “A Quantum Cup of Computation,” page 94), but it will be 
years before a quantum machine does any significant computing. A 
realistic goal for now, researchers say, is a device that could factor 
15 into 3 x 5. That may sound silly, but then Alexander Graham 
Bell showed off his new invention by summoning his assistant from 
the next room. 

In one sense the hardware hangup is deja vu all over again: 
Early electronic computers were plagued by faulty equipment that 
limited their usefulness. Their key component, the trusty vacuum 
tube, might burn out at any moment. This problem was solved by 
a technological miracle: the transistor. A similar breakthrough 
could be in store for quantum computing. (Incidentally, rumors of 
the vacuum tube’s demise are highly exaggerated. It remains the 
technology of choice among audiophiles.) 

But in another sense, quantum computing presents difficulties 
that are more nouveau vu. Classical computing’s reliability is 
ensured by a variety of error-correcting techniques; algorithms that 
help the computer repair “damaged” bits (see “Straightening Out 


Nonlinear Codes,” What’s Happening in the Mathematical 
Sciences, Volume 2). The rules of quantum mechanics, however, 
appear to proscribe any such algorithm. Classical error-correction 
methods tacitly assume you can follow computations along, look- 
ing for trouble. But peeking inside a quantum computer spoils the 
superposition of states, ruining the parallel computation. 

This no-peeking-allowed property led many researchers to con- 
clude that quantum computing could never be practical. To carry 
out any computation of reasonable length, the components of a 
quantum computer would have to operate with an error rate many 
orders of magnitude below what designers now expect of integrat- 
ed circuits. Even a transistor-like breakthrough wouldn’t help 
much, 

But a second mathematical breakthrough by Shor has given 
researchers hope that quantum computers may indeed someday 
compete with classical technology. In 1995, Shor showed that 
errors in quantum computing can be corrected. Shor’s result 
opened a floodgate of methods, so that now almost every classical 
error-correction algorithm has a quantum analog. 

Shor’s idea is to look not at qubits that contain the information 
you’re computing with—that’s still verboten—but at extra qubits 
that contain information about the information that you're striving 
to preserve. The trick is to set things up so that when you peek at 
the extra qubits, all you learn is whether or not your observation 
has just eliminated an error that had crept into the system. 

Shor’s algorithm shelters one qubit of information in a cushion 
of nine qubits. (By contrast, the simplest classical method uses 
three bits to protect one: do everything in triplicate, and use a 
“majority vote” rule to decide the case if one bit goes awry.) In 
effect, the error-correction algorithm decides which, if any, of the 
nine qubits has strayed, and stores that information as some extra 
qubits in such a way that, when those qubits are measured, the orig- 
inal nine are nudged back into their correct form. The procedure 
doesn’t violate the rules of quantum mechanics because the infor- 
mation gained by the measurement contains no clue to the actual 
state of the sheltered qubit, which stays in its original superposi- 
tion. 

Using nine qubits to protect one qubit worth of information is 
not very practical (but then, nothing about quantum computation is 
very practical yet). But improvements on Shor’s scheme followed 
quickly. Researchers at IBM and Los Alamos National Laboratory, 
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for example, found a way to shelter a qubit with just five qubits. 
More dramatically, Shor and Rob Calderbank, also at AT&T Labs, 
and, independently, Andrew Steane at Oxford University, found a 
family of classical error-correcting codes that could be converted 
into ones that work on a quantum computer. Soon after, Daniel 
Gottesman, then a graduate student in physics at Caltech (now at 
Los Alamos National Laboratory), developed a mathematical for- 
malism for quantum error-correcting codes, further expanding the 
family of options. Shor, Calderbank, and their colleagues Eric 
Rains and Neil Sloane also found a much larger family of classi- 
cally based codes. 

The new schemes for quantum error correction are based on 
close connections between error-correcting codes and linear alge- 
bra. Linear codes consist of vectors of 0's and 1's in a high-dimen- 
sional vector subspace. In effect, the quantum codes take advan- 
tage of the superposition principle in quantum mechanics to do the 
linear algebra that forces any errant codeword back into the sub- 
space from which it may have strayed. 

Many more obstacles, both technological and mathematical, 
must be overcome before quantum computing comes anywhere 
near today’s (not to mention tomorrow’s) electronic computers. 
But fifty or a hundred years from now, people may marvel at the 
patience of their predecessors, who made do with machines that did 
mere millions of computations per second. 


A quantum computer 
with as few as 30 
qubits could begin to 
rival conventional 
machines. 
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Tales from the Cryptosystem 


nleashed by the Internet, electronic commerce is spreading 

like the plague. But even as they rake in the e-dough, savvy 
cyber-capitalists can’t help fretting: How safe are the transactions 
that zip—or zigzag—through cyberspace? 

Secure communication systems based on principles of “public- 
key cryptography” have become popular on the Web. Public-key 
cryptography relies, in turn, on the existence of mathematical prob- 
lems that are exceedingly difficult to solve. But even the best of 
these systems leave open the possibility, however remote, that a 
persistent, clever hacker might discover the key that unlocks them. 
However, a recent mathematical discovery points the way toward 
public-key cryptosystems that can be broken only by someone who 
is extraordinarily clever—or awfully lucky indeed. 

Miklos Ajtai, a mathematician at IBM’s Almaden Research 
Center in San Jose, California, has shown that randomly generated 
instances of a mathematical problem known as the shortest-vector 
problem are extremely likely to be extraordinarily difficult to solve. 
More recently, Ajtai and Cynthia Dwork, also at Almaden, have 
developed a public-key cryptosystem that relies on the shortest- 
vector problem for its security. 

The security of public-key cryptosystems is based on a mathe- 
matical conjecture, which can be expressed as an inequality: 
P < NP. Roughly speaking, P is the set of mathematical prob- 
lems that can be solved in a reasonable amount of time by a com- 
puter; NP is the set of problems for which a solution can be veri- 
fied in a reasonable amount of time, but which may require an 
unreasonable amount of time to find a solution. The conjectured 
inequality asserts that there are N P problems that really cannot be 
solved in a reasonable amount of time. 

The simplest example of an N P problem is the factorization of 
large numbers. It’s quite easy, for instance, to verify that 4259 
x 2879 is a factorization of 12,261,661—you can do it on a pock- 
et calculator—but finding the factors given only the number is a 
different story. (To see for yourself, try factoring 12,200,467 on a 
calculator.) The best known factoring algorithms require an 
amount of calculation that grows exponentially with the number of 
digits in the number to be factored. These algorithms are all but 
useless for numbers with around 200 digits. 


———— 


Miklos Ajtai. (Photo courtesy of IBM 
Almaden Research Center.) 


Cynthia Dwork. (Photo courtesy 
of IBM Almaden Research Center.) 


———— 
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(3,4) 


(0,0) 


Figure 1. Vector addition. The sum of two vectors is the point that 
provides the fourth vertex of a parallelogram. 
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The difficulty of factoring large numbers is the basis for one of 
the original public-key cryptosystems, known as the RSA code (see 
“The Secret Life of Large Numbers,” What's Happening in the 
Mathematical Sciences, Volume 3). The RSA system codes mes- 
sages using the product of two large primes, but decoding requires 
knowing the prime factors themselves. 

Not every product pq of two primes is hard to factor, however. 
Some algorithms make quick work of products if the numbers 
p—1 and q — 1 both happen to factor into lots and lots of “small” 
primes, or if they have a large factor in common. These particular 
pitfalls are easy enough to avoid, but the question lingers: Could 
there be other algorithms, yet unknown, that work some fraction of 
the time, and thereby compromise factorization-based security? 

The same question arises for every problem not in P: Even if 
no general algorithm efficiently solves ail instances of the problem, 
that doesn’t mean there can’t be algorithms that handle a significant 
fraction of random instances. A hacker in possession of such an 
algorithm could troll the Internet, fishing for cases where the algo- 
rithm cracks the code. Even if an algo- 
rithm works only occasionally, the huge 
volume of Internet traffic and the value of 
the transactions may make the fishing 
expedition worthwhile. 

But now there’s a guarantee, of sorts, 
for at least one NP problem. 

The problem itself can be stated suc- 
cinctly: Retrieve a short vector after it’s 
been hidden by a basis of behemoths. To 
understand what this means, however, 
requires a brief introduction to linear alge- 
bra. 

Geometrically, a vector is just a point 
in space. But “space” may mean a 2- 
dimensional plane, ordinary 3-dimension- 
al space, or “hyper-space” of some higher 
dimension n. (Vector spaces can even be 
infinite-dimensional, but those don’t 
appear in cryptographic applications— 
yet.) What matters is that, once one point 
has been picked as the “zero” vector, pairs 
of vectors can be added: The sum of two 
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vectors is the fourth vertex in the parallelogram whose other three 
vertices are the zero vector and the two vectors being summed (see 
Figure 1). A vector can also be multiplied by a real number. The 
result is a point in line with the vector and the zero vector, and 
whose distance from the zero vector has been scaled by the size of 
the real number. (If the number is negative, the zero vector lies 
between the vector and its multiple.) 

Algebraically, vectors can be represented as ordered pairs, 
triples, or “n-tuples” of numbers—for example, the points labeled 
(2,1), (1,3), and (3,4) in Figure 1. The sum of two vectors is 
obtained simply by adding the corresponding numbers in the n- 
tuples: Thus (2,1) + (1,3) =(2+1,1+3) = (3,4). The zero 
vector, of course, is simply an n-tuple of 0’s. Multiplying a vector 
by a real number is equally easy: Just multiply each component 
number of the n-tuple, so that 2 - (2,3) = (4,6). 

The vectors whose n-tuples consist entirely of integers com- 
prise what’s called a lattice. More generally, a lattice is any set of 
regularly spaced points that satisfies the “parallelogram property.” 
If the zero vector belongs to the lattice, this means that the sum of 
any two lattice vectors is again in the lattice. Such a lattice can be 
defined as the integer combinations of a 
set of “basis” vectors, whose compo- 
nents themselves are integers. 

The obvious choice is to describe a 
lattice in terms of basis vectors that are 
as “short” as possible, i.e., as close to the 
zero vector as possible. In the plane, for 
example, a sensible basis for the full lat- 
tice of all integer vectors is simply (1,0) 
and (0,1)—the “unit” vectors in the 
“a” and “y” directions. 

But there’s no law that says you must 
use short vectors for your basis—and 
there may be good cryptographic rea- 
sons not to. 

It’s relatively easy to create a lattice 
which has just one “short” basis vector 
(see Figure 2). It is then possible to 


“hide” the short vector by combining itt —_$—_$_$_— ————————— 
with one or more of the “long” basis Figure 2. Basis vectors. Any two of the three indicated points can serve 
as a basis for the lattice of integer points indicated by solid circles 


vectors. For example, the vectors (3,8) 
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Figure 3. Hidden hyperplanes are key 
to a new cryptosystem. 
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and (1,8) are as good a basis for the lattice in Figure 2 as (2,0) 
and (1.8). 

When the dimension of the space is small, it’s relatively easy to 
compute the shortest vector in a lattice given any basis for it—just 
as it’s easy to factor a number with only a few digits. But the 
amount of calculation increases exponentially with the dimension, 
much as the difficulty of factorization grows with the number of 
digits. Finding short vectors is an NP problem, one that 
researchers believe really isn’t in P. But how sure can you be that 
a randomly generated example won't 
succumb to some slick algorithm that 
only works some of the time? 

Pretty darn sure, as it happens. 

The first breakthrough came in 
1996, when Ajtai showed that the 
shortest-vector problem and several 
variants carry a strong assurance of 
difficulty. He did not show that these 
problems are necessarily hard to 
solve—that would amount to a proof 
of the P < NP conjecture—but he 
did the next best thing. He showed 
that any algorithm that can efficiently 
solve any finite fraction of shortest- 
vector problems—even a probabilis- 
tic algorithm that succeeds only some 
of the time on some of the prob- 
lems—could be converted into an 
algorithm that efficiently solves all 
shortest-vector problems all the time. 
In other words, any hacker who 
cracks a code based on a shortest-vector problem will, in effect, 
have proved that the shortest-vector problem belongs in the class 
Pp, 

Ajtai’s proof was hailed as a breakthrough in theoretical com- 
puter science. “It’s been sort of a Holy Grail of cryptography, to do 
what he has done, for any sort of problem whatsoever,” says Ron 
Fagin, a colleague at IBM. Shafi Goldwasser, an expert in com- 
plexity theory at the Massachusetts Institute of Technology, says 
simply, “It’s a wonderful result.” 

But does the result lend itself to cryptographic applications? At 
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the time, no one knew a good way to use the shortest-vector prob- 
lem to concoct a secret code. 

Enter Dwork. She and Ajtai worked together to turn Ajtai’s the- 
oretical result into a public-key cryptosystem. In fact, they have 
found three separate cryptosystems, each based on the shortest-vec- 
tor problem. For each one, a random key is as hard to crack as the 
hardest key. 

The key, as it were, to Dwork and Ajtai’s approach is a “dual” to 
the shortest-vector problem, in which lattices are constructed that 
have a basis with just one “long” vector. This basis is kept secret; 
a second basis, in which all the vectors are long, is then computed 
using a random algorithm, and made public. In effect, the public 
key describes a lattice consisting of widely separated, “hidden 
hyperplanes” (see Figure 3). Only the person (or computer) in pos- 
session of the private key knows the orientation of the hidden 
hyperplanes and how far apart they are. 

Encryption is done one bit at a time. To encrypt, say a 0, the 
sender picks a random vector in the lattice, and then moves a short 
distance away in a random direction. To encrypt a I, on the other 
hand, the sender simply picks a random vector, which, with exceed- 
ingly high probability, will be a considerable distance from any 
point in the lattice (the chance of it actually being close to the lat- 
tice is akin to the chance of drawing to an inside straight a hundred 
times in a row). At the decoding end, the receiver does a simple 
calculation to determine how close the transmitted vector is to the 
nearest hyperplane. If it’s close, the original bit was almost cer- 
tainly a 0; otherwise it was a 1. 

As they currently stand, the Ajtai-Dwork cryptosystems are not 
practical: Turning each bit of a message into a high-dimensional 
vector with huge integer entries is too cumbersome to implement, 
unless the message is extremely short (say a simple Yes/No reply to 
a question). But the IBM researchers hope to refine their tech- 
niques and extend the analysis to obtain marketable codes. 

There’s always the worry, of course, that the conjecture 
P < NP is simply wrong. In that case, much of the mathemati- 
cal underpinning of public-key cryptography evaporates. But the 
“smart” money says the conjecture is true, and that it might even be 
proved someday. If someone does show it’s false, however, 
chances are he or she would rather have credit for the discovery 
than any ill-gotten gains from eavesdropping on the Internet. At 
least that’s the hope. 


— 


At the decoding end, 
the receiver does a 

simple calculation to 
determine how close 


the transmitted vector 


is to the nearest 
hyperplane. 
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The Bride Stripped Bare by Her Bachelors, Even (Large Glass, 1915-1923). Was Marcel 
Duchamp driven by an understanding of mathematics? (Photo courtesy of the Philadelphia 
Museum of Art.) 
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But Is It Math? 


lo many people, art and mathematics seem as far apart as the 
Bee air and a prison cell. One is the product of free imagina- 
tion, unlimited by the constraints of reality. The other is burdened 
by the need for careful calculation, and forced to follow lines of 
logic. 

But which is which? Is art the taproot or the rosebud? Is math- 
ematics the earthworm or the butterfly? Or are they inseparable 
twins, the two sides of a coin? Or could they be closer still: the 
front and back of a Mobius strip? 

Doris Schattschneider takes the Mébius view. Schattschneider, 
a professor of mathematics at Moravian College in Bethlehem, 
Pennsylvania, is an expert on one of the most popular artists of the 
twentieth century: M.C. Escher, the 
Dutch graphic artist whose playful 
geometric designs abound on posters 
and tee-shirts. 

Rhonda Roland Shearer, an artist 
based in New York, also sees an inti- 
mate intertwining of art and mathe- 
matics. But she highlights a different 
artist: Marcel Duchamp, who is 
widely regarded as the founder of 
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Escher’s work looks classical, like the 
product of a Renaissance master. 
Duchamp, on the other hand, is 
unmistakably an artist of the twenti- 
eth century. But both artists drew 
inspiration from mathematical ideas, 
and mathematics can help viewers 
understand their art. 

Maurits Cornelius Escher was 
born in 1898 and died in 1972. He 


Figure 1. “Angels and Devils” (1941, © 1998 Cordon Art B. V., Baarn- 
Holland. All rights reserved.) M. C. Escher took delight in filling the 
plane with unusual combinations of creatures. 
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worked primarily in woodcuts and lithographs, and 
many of his pictures are directly inspired by geometric 
ideas and notions of mathematical symmetry. He is 
best known for his tessellations—intricately interlock- 
ing figures in the form of birds, fish, lizards, and other 
animate shapes, which can be extended to fill the entire 
plane (see Figure 1). 

Escher’s interest in tessellation dates back to his 
earliest work, but it became a central theme for him 
after a trip to the Alhambra in 1936. The elaborate pat- 
terns in the decorative tilings and brickwork he saw 
there ignited his passion for geometric designs. Escher 
began an intense study of symmetry, one that would 
last the rest of his life. On the one hand, he was trying 
to find animal shapes that would fit together as “tiles”; 
on the other hand, he wanted to understand all the dif- 
ferent ways such shapes could tile the plane. 

A crucial event was Escher’s discovery, in 1937, of 
a paper by the Hungarian mathematician George Pélya 
(1887-1985). The paper, titled Uber die Analogie der 


Figure 2. George Pélya’s 1924 illustration depicting Kristallsymmetrie in der Ebene (“On the analogy of 
the 17 different symmetry groups inspired Escher to crystal symmetry in the plane”), had appeared in 1924 
undertake his own systematic study of planar symme- in Zeitschrift fiir Kristallographie (Journal of 


tries. 
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Crystallography). In it, Pélya described the mathemat- 

ical classification of 2-dimensional symmetry groups; 
in particular, he explained the remarkable fact that, despite the 
seemingly infinite variety of 2-dimensional repeating patterns, 
there are really just 17 different types if they are classified accord- 
ing to their symmetries. Pélya gave an example to illustrate each 
symmetry group (see Figure 2). 

Over the next four years, Escher developed his own “layman’s 
theory” of planar symmetries. He summarized his findings in a 
large notebook, which he labeled Regelmatige vlakverdeling in 
asymmetrische congruente veelhoeken (“Regular division of the 
plane with asymmetric congruent polygons”). This notebook 
became the source for much of his subsequent work. 

The asymmetry of the polygons was crucial for Escher: It 
enabled him to transform purely geometric shapes into the lively 
creatures for which he is known. Indeed, Schattschneider stresses, 
Escher never produced a finished work that is pure tessellation; his 


prints all have human or natural elements. 
“Symmetry was a device for him, and that’s all it 
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was,” Schattschneider says. “It was an intellectual q q 4 4 
challenge for him to figure out what kinds of things he 4 ‘ q 
could do, but ultimately he was going to use them [the Lae ms zi < a a 7 ": 
symmetry drawings].” 7 fe | q 
Escher would sometimes impose a constraint, and *" ba Ses £ af 
then see how much design freedom remained. Some of e a ae 
his self-imposed rules lead to complicated counting tet > < 
problems. For example, if you start with a square See Gea  Gine 
design that looks different in each of its four orienta- 7 ne 4 
tions (see Figure 3), one constraint is to require that > < ley 5. 2 < 
four copies of the design or its rotational counterparts _— ven 
be fitted together in a larger, 2 x 2 square, which is a aoe “aH 
then used to tile the plane by translations in the hori- 2 fe) 4 Lo] Tsk 
zontal and vertical directions. The remaining freedom ra eS a = a a e 
is to decide on the orientations of the four squares in wl / % 
the 2 x 2 tile. Offhand, there are 256 possible combi- mt fess | Sl 
nations But two versions of the final (infinite) pattern v ve vypN vy 
are considered “the same” if one can be laid exactly on > i< ae: > < 
top of the other, possibly after a rotation. So the ques- Th, ~ SHAT Tw 
tion is, How many different patterns are possible from 
a given design? Figure 3. The four orientations of a square design can 


The answer turns out to be 23 (see Figure 3). Escher be fit together in a 2x 2 arrangement in numerous 
found the answer by methodically creating all patterns ways. If the 2 x 2 squares are used to tile the plane, 


with the original design in the upper left corner (there 
are 64 in all, corresponding to 4 choices for each of the 
other three squares in the 2 x 2 design), then cross-referencing the 
repetitions. 

The problem becomes harder when mirror images of the origi- 
nal design are allowed: Now there are 512 patterns with the origi- 
nal design in the upper left corner, and the human eye is hard 
pressed to tell when one pattern is merely, say a reflected and rotat- 
ed version of another. Computers are harder to fool, though, and a 
whole lot quicker. In an article in The Electronic Journal of 
Combinatorics (1997), Daniel Davis, a mathematician at 
Kingsborough Community College in New York, reports that there 
are 154 such patterns. 

The problem becomes even harder, Schattschneider notes, if you 
make it 3-dimensional: Imagine a plexiglas cube with a design 


there are 23 different patterns possible. 
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Figure 4. A tiling of the hyperbolic 
plane. (Figure from Introduction to 
Geometry, H. S. M. Coxeter. Reprinted 
by permission of John Wiley & Sons, 
Inc.) 
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inside that looks different in each of its 24 orientations (each of its 
6 faces can be on top, with 4 rotational positions for each) and each 
of its 24 mirror-imaged orientations; fit eight of these together in a 
2 x 2 x 2 cube, and then tile space with the larger cube. 

In 1958, he 
experienced another mathematical epiphany, when he saw a geo- 


Escher’s love of tilings went beyond the plane. 


metric diagram of curved triangles inside a circle (see Figure 4), 
The figure was in an article on “Crystal Symmetry and Its 
Generalizations,” by H.S.M. Coxeter, a famous geometer at the 
University of Toronto. It illustrated a tiling of what's 

known as the “hyperbolic” plane, a non-euclid- 
ean surface in which each point lies on infi- 
nitely many lines that are parallel to a 
given line. Coxeter’s diagram solved 

a problem Escher had been strug- 
gling with: How to get a progres- 
sion of interlocking figures to 
retain their basic shape while 
diminishing in size as they 
approach the boundary of an 

enclosed region. He produced a 

series of woodcuts based on 
hyperbolic tilings (see Figure 5). 
Escher’s work is tremendously 
popular with mathematicians and com- 
puter scientists. The centennial of his 
birth was celebrated in 1998, at a weeklong 
conference held in Rome (where Escher lived for a 
decade) and Ravello (a small town on the Amalfi coast in Italy, 
where Escher often vacationed). His centennial was also marked 
by special exhibitions at numerous art museums, including the 
National Gallery of Art in Washington, D.C. 

Duchamp, by constrast, is not well known as a mathematician’s 
artist. But his work also poses problems, though of a more obscure 
nature. 

Marcel Duchamp was born in 1887 and died in 1968. His mas- 
terpiece, known as the Large Glass (the formal title is The Bride 
Stripped Bare by Her Bachelors, Even), was completed in 1923. 
But he is best known for his outrageous “readymades”—apparent- 
ly everyday objects such as hatracks and snowshovels, which the 
artist turned into works of art by the act of selecting them. 


Duchamp’s most famous readymade, wryly titled Fountain, was a 
porcelain urinal. It caused an artistic uproar when he submitted it 
to an exhibition in 1917. 

Surprisingly, says Shearer, Duchamp’s inspiration for ready- 
mades—and their connection with the Large Glass—can be traced 
to an essay by the mathematician Henri Poincaré (1854-1912). 

Poincaré wrote several popular books on mathematics and sci- 
ence. In Science and Method, published in 1908, he speculated on 
the nature of creativity in mathematics. (Poincaré’s essay on this 
subject appears as the final article in this volume of 
What’s Happening in the Mathematical 
Sciences.) According to Poincaré, the cre- 
ative process is largely unconscious. 
Ideas flit about in the cerebral cel- 
lars,“like the gaseous molecules in 
the kinetic theory of gases,” he 
writes. They form endless combi- 
nations as they collide, but only 
the “fruitful” combinations ever 
bubble to the surface. In describ- 
ing the kind of mathematics the 
unconscious produces, Poincaré 
stressed the phrase “ready-made.” 

Duchamp’s familiarity with 
Poincaré, and with other mathemati- 
cians, has been documented by scholars 
such as Craig Adcock at the University of 
Towa and Linda Henderson at the University of Texas 
at Austin. Duchamp and many other artists, including Picasso and 
the Czech artist Frantisek Kupka, were inated by the discover- 
ies mathematicians had made in the nineteenth century about 4- 
dimensional and non-euclidean geometry. According to Adcock, 
the idea of projecting geometric figures from nm dimensions to 
n—1 dimensions is central to Duchamp’s work. “Marcel 


Duchamp was not a mathematician, but his uses of geometry were 
sophisticated,” he writes in a study of the Large Glass. 

“There’s just no doubt about it,’ adds Michael Taylor, a 
Duchamp scholar at the Philadelphia Museum of Art, where the 
Large Glass now resides. “He was very interested in mathematics 
right from the start.” For Duchamp, Taylor explains, mathematics 
“became a way out of convention and the tradition of art. And the 


Figure 5. “Circle Limit IV” (1960, 
© 1998 Cordon Art B. V., Baarn- 
Holland. All rights reserved.) Inspired 
by Coxeter, Escher switched from 
euclidean to hyperbolic tessellations. 
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The Large Glass is 
exactly what the name 
implies: a large piece 
of glass. 
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Large Glass is really the great manifestation of that.” 

Shearer goes further. The Large Glass, she says, is Duchamp’s 
artistic rendering of Poincaré’s theory of creativity—and his ready- 
mades are products of what she calls the “Poincaré/Duchamp cre- 
ativity machine.” 

Duchamp himself referred to the Large Glass as an “agricultur- 
al machine,” and the notes he compiled during the eleven years he 
spent creating it, which he collected in a companion work titled the 
Green Box (that’s literally what he put the notes in), contain numer- 
ous references to mathematics, mechanical devices, and the pro- 
duction of readymades. Shearer says that recognizing the parallels 
between Duchamp’s masterpiece and Poincaré’s essay will help 
scholars unravel some of the enigmatic artist’s mysteries. 

Like the Green Box, the Large Glass is exactly what the name 
implies: a large piece of glass. Actually, there are two pieces, fixed 
in a 6 x 9-foot frame. The lower, “bachelor” half includes several 
exquisitely rendered drawings of mechanisms, including a massive, 
three-wheeled chocolate grinder standing improbably on delicate, 
Louis XV-style legs. The chocolate grinder is linked to a watermill 
inside a wire-like “glider” by a semicircular arc of cones. A set of 
nine “malic moulds”—chess-like pieces which are named in the 
Green Box with occupational titles like policeman and undertak- 
er—hangs over the glider like a mobile, while a group of foreshort- 
ened circular designs float in space to the right of the chocolate 
grinder. 

In contrast to the mechanical, workmanly bachelor half, the 
upper, “bride” portion of the Large Glass is jumbled and cloudy. 
The most prominent feature is a cloud with three roughly rectangu- 
lar chambers, which Duchamp referred to as “draft pistons.” 
Dangling down (or rising up) on the left is a cascade of spare parts 
that could have come from a rust-belt junkyard. But the random- 
ness of the assemblage is deceptive: The cascade is meticulously 
copied from an earlier painting titled simply The Bride—and even 
there the parts are borrowed from other, still earlier works. 

According to Shearer, the Large Glass is Duchamp’s creativity 
machine, an embodiment of Poincaré’s description of the creative 
process. Poincaré himself remarks on the “delicate sieve” that per- 
mits only promising ideas to reach the surface, and the “sudden illu- 
mination” that results when they get there. In the Green Box, 
Duchamp refers to the are of cones as “sieves,” and repeatedly 


writes about “the illuminating gas." He opens the Green Box 
with 


“Given 1. the waterfall 2. the illuminating gas, we shall determine 
the conditions for the instantaneous State of Rest of a succession of 
various facts seeming to necessitate each other under various laws, 
in order to isolate the sign of accordance between, on the one hand, 
this State of Rest (capable of innumerable eccentricities) and, on 
the other, a choice of Possibilities authorized by these laws and also 
determining them.” 


(The quote is, of course, translated, but also slightly edited—the 
version in the Green Box includes handwritten insertions, underlin- 
ing, and circled words and phrases.) 

The readymades, Shearer concludes, are Duchamp’s artistic 
interpretation of Poincaré’s assertion that “discovery is selection.” 
In an article on the Duchamp-Poincaré connection, published in 
The Sciences in 1997, Shearer writes “The readymades were not 
just unexpected individual art objects; they were parts of an exper- 
imental game re-enacting the larger and more general creative 
process.” 

Shearer predicts that understanding the Large Glass as a 
Poincaréan creativity engine will enable students of Duchamp’s 
work to solve some of the other “problems” of interpretation that 
Duchamp left in his wake. She herself has been taking a closer look 
at the readymades, many of which survive only in photographs and 
imprecise reproductions. When the pictures are carefully analyzed, 
Shearer says, some of these supposedly mass-produced items are 
revealed to involve implausible features. The handle of Duchamp’s 
snowshovel, for example, has a square cross-section, rather than the 
common, more sensible, circular shape—which could, says 
Shearer, help explain Duchamp’s title for it, In Advance of the 
Broken Arm. 

There is more, of course, to Duchamp’s work, especially the 
Large Glass, than a simple encoding of Poincaré’s ideas. Escher’s 
work, too, can be appreciated on many levels, and for reasons apart 
from its exploration of mathematical ideas. The puzzles each artist 
presents are largely in the eye, and mind, of the beholder. As 
Duchamp liked to say, “There is no solution, because there is no 
problem.” 


“The readymades were 


not just unexpected 
individual art objects; 
they were parts of an 
experimental game 
re-enacting the larger 
and more general 
creative process.” 
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Marjorie Senechal. (Photo courtesy 
of Marjorie Senechal.) 
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Crystal Quest 


Escher’s investigations of symmetry paralleled—and in 
some ways outpaced—the research of crystallographers in 
the 1940’s and 50’s. He was invited to give an address at a 
meeting of the International Union of Crystallographers 
(IUC) in 1960. In it, he described his “layman’s theory” of 


regular divisions of the plane. 

Curiously, modern crystallography is leaning more and 
more toward irregular divisons of the plane. The IUC has 
adopted a new mathematical definition of the subject’s key 
word: crystal. 

Marjorie Senechal, a mathematician at Smith College in 
North Hampton, Massachusetts, who served on the IUC 
committee that proposed the new definition, explains that the 
notion of what a crystal is has changed many times over the 
years. At one time, scientists even believed that crystals 
were analogous to plant and animal life—after all, crystals 
grow, and often exhibit delicate features like those of feath- 
ers and veins. But by the end of the nineteenth century, it 
was generally accepted that crystals are regular, 3-dimen- 
sional arrangements of atoms or molecules. Table salt, for 
example, is a neat, cubical array of sodium and chlorine 
atoms, while ordinary ice forms a hexagonal array of water 
molecules. 

The idea of a crystal as a regular array of pointlike objects 
is very appealing. For one thing, it explains the beautiful 
symmetries that characterize common crystals. It can be 
proved mathematically (and was, by Euclid) that the only 
possible rotational symmetries in a regular array are by mul- 
tiples of 60, 90, or 120 degrees. In particular, five-fold sym- 
metry is ruled out (see Figure 6). The occasional pentagon 
that’s seen in things like iron pyrite (fool’s gold) is not a true, 
regular pentagon—it’s just an accident of the way the facet 
is cut. 

This neat theory of crystals was thrown for a loop in 1984, 
when Dany Shechtman, a materials scientist at the Technion 
in Israel and the U. S. National Bureau of Standards (now the 
National Institute of Standards and Technology), found evi- 
dence of crystals that do possess the forbidden, five-fold 
symmetry. 

The evidence came in the form of X-ray diffraction pat- 
terns from an alloy of aluminum and manganese. Roughly 
speaking, X-ray diffraction patterns provide a direct mea- 
surement of a crystal’s symmetries. Mathematically, a dif- 
fraction pattern is related to what's called a Fourier trans- 
form. When the (mathematical) object being transformed is 
a regular array of discrete points, the Fourier transform also 
appears as a regular array of discrete points, which show up 
as bright spots in an X-ray. A “correct” diffraction pattern 
for a crystal should have three-, four-, or six-fold symmetry, 
not the five-fold symmetry that Shechtman saw. 

At first, Shechtman didn’t believe his eyes, and he spent 


the better part of the two years trying to figure out what he’d 
done wrong. But he couldn’t find any flaws in his experi- 
ments or his equipment. Finally, in a joint paper with Ian 
Blech, Denis Gratias, and John W. Cahn, Shechtman 
announced the discovery of what are now called quasicrys- 
tals. 

Shechtman’s stunning discovery doesn’t invalidate the old 
point of view: Table salt, ice, and other common crystals are 
still the regular arrangements they've always been. But qua- 
sicrystals have caused crystallographers to rethink—and 
broaden—the scope of their discipline. 

The new definition, Senechal explains, is based on the 
crystallographer’s primary tool: X-ray diffraction. A crystal, 
by this definition, is any material whose X-ray diffraction 
pattern includes sharp, bright spots. 

A bright spot corresponds to what mathematicians know 
as a “delta function.” Under the old definition, the delta 
functions themselves are in a regular array. With the new 
definition, that’s no longer required. 

Lots of mathematical functions have Fourier transforms 
that fit the new definition of crystal. For instance, a class of 
geometric patterns known as Penrose tilings provide point- 


set functions whose transforms have exactly the same kind of 
five-fold symmetry as Shechtman’s quasicrystals (see Figure 
as 


Figure 6. Five-fold symmetry is impossible in a repeating planar pattern: 
Any center of five-fold symmetry would lead to an infinite nesting of pen- 
tagons. The upshot would be a set of points that are dense in the plane. 


—— 
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The search is still on for an understanding of these new 
materials. (“Real quasicrystals are not made up of Penrose 


tiles,” Senechal cautions.) Exploring the implications of the 
new definition is another exciting quest. The interplay of 
mathematics and crystallography is far from finished. 
Maybe we need another Escher. 


Figure 7. A triangular piece of a 
Penrose tiling (top) and its Fourier 
transform (bottom). (Figure courtesy 
of Marjorie Senechal and Cambridge 
University Press, from Quasicrystals 
and Geometry, Figure 8.5, 1995.) 
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Figure 8. Straight Lines in Four 
Directions & All Their Possible 
Combinations (1973). LeWitt’s origi- 
nal etching does not include the 
“empty” square in the lower right- 
hand corner. (Reproduced with the 
permission of Parasol Press, Inc.) 
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Henri Poincaré. (Photograph courtesy of the Mittag-Leffler Institute.) 
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Mathematical Discovery 


(by Henri Poincaré) 


The twentieth century will be remembered for marking the first 
scientific studies of the human mind. From Freud’s early excava- 
tions of the subconscious, to the mid-century exuberance over 
analogies with computing machines, to the most recent PET-scan 
studies and the popularity of Prozac, science has chipped away at 
what many consider the last bastion of philosophy. The fledgling 
investigations now underway may prove prescient, or they may 
ultimately land on the same scientific scrap heap as phlogiston and 
the ether. Either way, our thinking about thinking has permanent- 
ly changed. 

In the following essay, first published in 1908, the French math- 
ematician Henri Poincaré speculates on the nature of mathematical 
thinking. The essay is pertinent to several articles in this volume of 
What's Happening in the Mathematical Sciences. Poincaré touch- 
es on issues that are being illuminated by new research in auto- 
matic theorem proving, and he points out some of the similarities— 
and differences—between mathematical thinking and the thinking 
that goes into chess, And Poincaré’s essay itself may well have 
influenced the course of modern art, through the work of Marcel 
Duchamp. 

“Mathematical Discovery” first appeared in Science and 
Method, one of several books Poincaré wrote for the general pub- 
lic, It seems only fitting to let Henri Poincaré have the last word on 
what’s happening in the mathematical sciences. 


lhe genesis of mathematical discovery is a problem which must 
inspire the psychologist with the keenest interest. For this is 

the process in which the human mind seems to borrow least from 
the exterior world, in which it acts, or appears to act, only by itself 
and on itself, so that by studying the process of geometric thought 
we may hope to arrive at what is most essential in the human mind. 
This has long been understood, and a few months ago a review 
called /'Enseignement Mathématique, edited by MM. Laisant and 
Fehr, instituted an enquiry into the habits of mind and methods of 
work of different mathematicians. I had outlined the principal fea- 
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that there are people 
who do not understand 
mathematics? 
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tures of this article when the results of the enquiry were published, 
so that I have hardly been able to make any use of them, and I will 
content myself with saying that the majority of the evidence con- 
firms my conclusions. I do not say there is unanimity, for on an 
appeal to universal suffrage we cannot hope to obtain unanimity. 

One first fact must astonish us, or rather would astonish us if we 
were not too much accustomed to it. How does it happen that there 
are people who do not understand mathematics? If the science 
invokes only the rules of logic, those accepted by all well-formed 
minds, if its evidence is founded on principles that are common to 
all men, and that none but a madman would attempt to deny, how 
does it happen that there are so many people who are entirely 
impervious to it? 

There is nothing mysterious in the fact that every one is not 
capable of discovery. That every one should not be able to retain a 
demonstration he has once learnt is still comprehensible. But what 
does seem most surprising, when we consider it, is that any one 
should be unable to understand a mathematical argument at the 
very moment it is stated to him. And yet those who can only fol- 
low the argument with difficulty are in a majority; this is incon- 
testable, and the experience of teachers of secondary education will 
certainly not contradict me. 

And still further, how is error possible in mathematics? A 
healthy intellect should not be guilty of any error in logic, and yet 
there are very keen minds which will not make a false step in a 
short argument such as those we have to make in the ordinary 
actions of life, which yet are incapable of following or repeating 
without error the demonstrations of mathematics which are longer, 
but which are, after all, only accumulations of short arguments 
exactly analogous to those they make so easily. Is it necessary to 
add that mathematicians themselves are not infallible? 

The answer appears to me obvious. Imagine a long series of syl- 
logisms in which the conclusions of those that precede form the 
premises of those that follow. We shall be capable of grasping each 
of the syllogisms, and it is not in the passage from premises to con- 
clusion that we are in danger of going astray. But between the 
moment when we meet a proposition for the first time as the con- 
clusion of one syllogism, and the moment when we find it once 
more as the premise of another syllogism, much time will some- 
times have elapsed, and we shall have unfolded many links of the 
chain; accordingly it may well happen that we shall have forgotten 


it, or, what is more serious, forgotten its meaning. So we may 
chance to replace it by a somewhat different proposition, or to pre- 
serve the same statement but give it a slightly different meaning, 
and thus we are in danger of falling into error. 

A mathematician must often use a rule, and, naturally, he begins 
by demonstrating the rule. At the moment the demonstration is 
quite fresh in his memory he understands perfectly its meaning and 
significance, and he is in no danger of changing it. But later on he 
commits it to memory, and only applies it in a mechanical way, and 
then, if his memory fails him, he may apply it wrongly. It is thus, 
to take a simple and almost vulgar example, that we sometimes 
make mistakes in calculation, because we have forgotten our mul- 
tiplication table. 

On this view special aptitude for mathematics would be due to 
nothing but a very certain memory or a tremendous power of atten- 
tion. It would be a quality analogous to that of the whist player 
who can remember the cards played, or, to rise a step higher, to that 
of the chess player who can picture a very great number of combi- 
nations and retain them in his memory. Every good mathematician 
should also be a good chess player and vice versa, and similarly he 
should be a good numerical calculator. Certainly this sometimes 
happens, and thus Gauss was at once a geometrician of genius and 
a very precocious and very certain calculator. 

But there are exceptions, or rather I am wrong, for I cannot call 
them exceptions, otherwise the exceptions would be more numer- 
ous than the cases of conformity with the rule. On the contrary, it 
was Gauss who was an exception. As for myself, I must confess I 
am absolutely incapable of doing an addition sum without a mis- 
take, Similarly I should be a very bad chess player. I could easily 
calculate that by playing in a certain way I should be exposed to 
such and such a danger; I should then review many other moves, 
which I should reject for other reasons, and I should end by mak- 
ing the move I first examined, having forgotten in the interval the 
danger I had foreseen. 

In a word, my memory is not bad, but it would be insufficient to 
make me a good chess player. Why, then, does it not fail me in a 
difficult mathematical argument in which the majority of chess 
players would be lost? Clearly because it is guided by the general 
trend of the argument. A mathematical demonstration is not a sim- 
ple juxtaposition of syllogisms; it consists of syllogisms placed in 
a certain order, and the order in which these elements are placed is 
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much more important than the elements themselves. If I have the 
feeling, so to speak the intuition, of this order, so that I can perceive 
the whole of the argument at a glance, I need no longer be afraid of 
forgetting one of the elements; each of them will place itself natu- 
rally in the position prepared for it, without my having to make any 
effort of memory. 

It seems to me, then, as I repeat an argument I have learnt, that 
I could have discovered it. This is often only an illusion; but even 
then, even if I am not clever enough to create for myself, I redis- 
cover it myself as I repeat it. 

We can understand that this feeling, this intuition of mathemat- 
icakorder, Which enables us to guess hidden harmonies and rela- 
tions, cannot'belong to every one. Some have neither this delicate 
feeling that is difficult to define, nor a power of memory and atten- 
tion above the common, and so they are absolutely incapable of 
understanding even the first steps of higher mathematics. This 
applies to the majority of people, Others have the feeling only ina 
slight degree, but they-are gifted with an uncommon memory and a 
great capacity for-attention.\ They learn the details one after the 
other by heart, they can understand mathematics and sometimes 
apply them, but\they are. not ina condition to create. Lastly, oth- 
ers possess the special intuition I have spoken of more or less high- 
ly developed, and they’can not only understand mathematics, even 
though«their memory is in no way extraordinary, but they can 
become: creators; and seek to make discovery with more or less 
chance Of success, according as their intuition is more or less devel- 
Oped. 

What, in fact, is mathematical discovery? It does not consist in 
making new combinations with mathematical entities that are 
alteady known. That can be done by any one, and the combinations 
that could be so formed would be infinite in number, and the greater 
part of them would be absolutely devoid of interest. Discovery 
consists precisely in not constructing useless combinations, but in 
constructing those that are useful, which are an infinitely small 
minority. Discovery is discernment, selection. 

How this selection is to be made I have explained above. 
Mathematical facts worthy of being studied are those which, by 
their analogy with other facts, are capable of conducting us to the 
knowledge of a mathematical law, in the same way that experi- 
mental facts conduct us to the knowledge of a physical law. They 
are those which reveal unsuspected relations between other facts, 
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long since known, but wrongly believed to be unrelated to each 
other. 

Among the combinations we choose, the most fruitful are often 
those which are formed of elements borrowed from widely sepa- 
rated domains. I do not mean to say that for discovery it is suffi- 
cient to bring together objects that are as incongruous as possible. 
The greater part of the combinations so formed would be entirely 
fruitless, but some among them, though very rare, are the most 
fruitful of all. Discovery, as I have said, is selection. But this is 
perhaps not quite the right word. It suggests a purchaser who has 
been shown a large number of samples, and examines them one 
after the other in order to make his selection. In our case the sam- 
ples would be so numerous that a whole life would not give suffi- 
cient time to examine them. Things do not happen in this way. 
Unfruitful combinations do not so much as present themselves to 
the mind of the discoverer. In the field of his consciousness there 
never appear any but really useful combinations, and some that he 
rejects, which, however, partake to some extent of the character of 
useful combinations. Everything happens as if the discoverer were 
a secondary examiner who had only to interrogate candidates 
declared eligible after passing a preliminary test. 

But what I have said up to now is only what can be observed or 
inferred by reading the works of geometricians, provided they are 
read with some reflection. 

It is time to penetrate further, and to see what happens in the 
very soul of the mathematician. For this purpose I think I cannot 
do better than recount my personal recollections. Only I am going 
to confine myself to relating how I wrote my first treatise on 
Fuchsian functions. I must apologize, for I am going to introduce 
some technical expressions, but they need not alarm the reader, for 
he has no need to understand them. I shall say, for instance, that I 
found the demonstration of such and such a theorem under such 
and such circumstances; the theorem will have a barbarous name 
that many will not know, but that is of no importance. What is 
interesting for the psychologist is not the theorem but the circum- 
stances. 

For a fortnight I had been attempting to prove that there could 
not be any function analogous to what I have since called Fuchsian 
functions. I was at that time very ignorant. Every day I sat down 
at my table and spent an hour or two trying a great number of com- 
binations, and I arrived at no result. One night I took some black 
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coffee, contrary to my custom, and was unable to sleep. A host of 
ideas kept surging in my head; I could almost feel them jostling one 
another, until two of them coalesced, so to speak, to form a stable 
combination. When morning came, I had established the existence 
of one class of Fuchsian functions, those that are derived from the 
hypergeometric series. I had only to verify the results, which only 
took a few hours. 

Then I wished to represent these functions by the quotient of two 
series. This idea was perfectly conscious and deliberate; I was 
guided by the analogy with elliptical functions. I asked myself 
what must be the properties of these series, if they existed, and I 
succeeded without difficulty in forming the series that I have called 
Theta-Fuchsian. At this moment I left Caen, where I was then liv- 
ing, to take part in a geological conference arranged by the School 
of Mines. The incidents of the journey made me forget my mathe- 
matical work. When we arrived at Coutances, we got into a break 
to go for a drive, and, just as I put my foot on the step, the idea 
came to me, though nothing in my former thoughts seemed to have 
prepared me for it, that the transformations I had used to define 
Fuchsian functions were identical with those of non-Euclidean 
geometry. I made no verification, and had no time to do so, since I 
took up the conversation again as soon as I had sat down in the 
break, but I felt absolute certainty at once. When I got back to Caen 
I verified the result at my leisure to satisfy my conscience. 

I then began to study arithmetical questions without any great 
apparent result, and without suspecting that they could have the 
least connexion with my previous researches. Disgusted at my 
want to success, I went away to spend a few days at the seaside, and 
thought of entirely different things. One day, as I was walking on 
the cliff, the idea came to me, again with the same characteristics 
of conciseness, suddenness, and immediate certainty, that arith- 
metical transformations of indefinite ternary quadratic forms are 
identical with those of non-Euclidean geometry. Returning to 
Caen, I reflected on this result and deduced its consequences. The 
example of quadratic forms showed me that there are Fuchsian 
groups other than those which correspond with the hypergeometric 
series; I saw that I could apply to them the theory of the Theta- 
Fuchsian series, and that, consequently, there are Fuchsian func- 
tions other than those which are derived from the hypergeometric 
series, the only ones I knew up to that time. Naturally, I proposed 
to form all these functions. I laid siege to them systematically and 
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captured all the outworks one after the other. There was 
ever, which still held out, whose fall would carry with it 
central fortress. But all my efforts were of nojavail at fi 
to make me better understand the difficulty, whi 
something. All this work was perfectly consc! 

Thereupon I left for Mont-Valérien, where 
time in the army, and so my mind was preocct 
ferent matters. One day, as I was crossing the s 
the difficulty which had brought me to a stan 
at once. I did not try to fathom it immediately, 
my service was finished that I returned to the q 
elements, and had only to assemble and arrangt 
I composed my definitive treatise at a sitting a 
culty. 

Itis useless to multiply examples, and I will 
this one alone. As regards my other 
should give would be exactly similar, and the: 
by other mathematicians in the enquiry © 
Mathématique would only confirm them. 

One is at once struck by these appearances of si 
tion, obvious indications of a long course of previo 
work. The part played by this unconscious work i 
discovery seems to me indisputable, and we shall fi 
other cases where it is less evident. Often when a 
at a difficult question, he accomplishes nothing 
to work. Then he takes more or less of a rest. 
at his table. During the first half-hour he still 
then all at once the decisive idea presents itse 
might say that the conscious work proved mo 
was interrupted and the rest restored force 
mind. But it is more probable that the res' 
unconscious work, and that the result of this work 
revealed to the geometrician exactly as in the eases I have quotes 
except that the revelation, instead of coming t during a wa 
or a journey, came during a period of conscio tk, but indep 
dently of that work, which at most only p the unlocking 
process, as if it were the spur that excited into ious fo he 
results already acquired during the rest, whic then ren 
unconscious. 

There is another remark to be made regarding the conditi 
this unconscious work, which is, that it is not possible, or 
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case not fruitful, unless it is first preceded and then followed by a 
period of conscious work. These sudden inspirations are never pro- 
duced (and this is sufficiently proved already by the examples I 
have quoted) except after some days of voluntary efforts which 
appeared absolutely fruitless, in which one thought one had accom- 
plished nothing, and seemed to be on a totally wrong track. These 
efforts, however, were not as barren as one thought; they set the 
unconscious machine in motion, and without them it would not 
haye worked at all, and would not have produced anything. 

The necessity for the second period of conscious work can be 
eyen more readily understood. It is necessary to work out the results 
of the inspiration, to deduce the immediate consequences and put 
them in order and to set out the demonstrations; but, above all, it is 
necessary to verify them. I have spoken of the feeling of absolute 
certainty which accompanies the inspiration; in the cases quoted this 
feeling was not deceptive, and more often than not this will be the 
case. But we must beware of thinking that this is a rule without 
exceptions. Often the feeling deceives us without being any less dis- 
tinct on that account, and we only detect it when we attempt to 
establish the demonstration. I have observed this fact most notably 
with regard to ideas that have come to me in the morning or at night 
when I have been in bed in a semi-somnolent condition. 

Such are the facts of the case, and they suggest the following 
reflections. The result of all that precedes is to show that the 
unconscious ego, or, as it is called, the subliminal ego, plays a most 
important part in mathematical discovery. But the subliminal ego 
is generally thought of as purely automatic. Now we have seen that 
mathematical work is not a simple mechanical work, and that it 
could not be entrusted to any machine, whatever the degree of per- 
fection we suppose it to have been brought to. It is not merely a 
question of applying certain rules, of manufacturing as many com- 
binations as possible according to certain fixed laws. 

The combinations so obtained would be extremely numerous, 
useless, and encumbering. The real work of the discoverer consists 
in choosing between these combinations with a view to eliminating 
those that are useless, or rather not giving himself the trouble of 
making them at all. The rules which must guide this choice are 
extremely subtle and delicate, and it is practically impossible to 
State them in precise language; they must be felt rather than formu- 
lated. Under these conditions, how can we imagine a sieve capable 
of applying them mechanically? 


The following, then, presents itself as a first hypothesis. The 
subliminal ego is in no way inferior to the conscious ego; it is not 
purely automatic; it is capable of discernment; it has tact and light- 
ness of touch; it can select, and it can divine. More than that, it can 
divine better than the conscious ego, since it succeeds where the 
latter fails. In a word, is not the subliminal ego superior to the con- 
scious ego? The importance of this question will be readily under- 
stood. In a recent lecture, M. Boutroux showed how it had arisen 
on entirely different occasions, and what consequences would be 
involved by an answer in the affirmative. (See also the same 
author’s Science et Religion, pp. 313 et. seq.) 

Are we forced to give this affirmative answer by the facts I have 
just stated? I confess that, for my part, I should be loth to accept it. 
Let us, then, return to the facts, and see if they do not admit of some 
other explanation, 

It is certain that the combinations which present themselves to 
the mind in a kind of sudden illumination after a somewhat pro- 
longed period of unconscious work are generally useful and fruit- 
ful combinations, which appear to be the result of a preliminary 
sifting. Does it follow from this that the subliminal ego, having 
divined by a delicate intuition that these combinations could be 
useful, has formed none but these, or has it formed a great many 
others which were devoid of interest, and remained unconscious? 

Under this second aspect, all the combinations are formed as a 
result of the automatic action of the subliminal ego, but those only 
which are interesting find their way into the field of consciousness. 
This, too, is most mysterious. How can we explain the fact that, of 
the thousand products of our unconscious activity, some are invited 
to cross the threshold, while others remain outside? Is it mere 
chance that gives them this privilege? Evidently not. For instance, 
of all the excitements of our senses, it is only the most intense that 
retain our attention, unless it has been directed upon them by other 
causes. More commonly the privileged unconscious phenomena, 
those that are capable of becoming conscious, are those which, 
directly or indirectly, most deeply affect our sensibility. 

It may appear surprising that sensibility should be introduced in 
connexion with mathematical demonstrations, which, it would 
seem, can only interest the intellect. But not if we bear in mind the 
feeling of mathematical beauty, of the harmony of numbers and 
forms and of geometric elegance. It is a real aesthetic feeling that 
all true mathematicians recognize, and this is truly sensibility. 
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Now, what are the mathematical entities to which we attribute 
this character of beauty and elegance, which are capable of devel- 
oping in us a kind of aesthetic emotion? Those whose elements are 
harmoniously arranged so that the mind can, without effort, take in 
the whole without neglecting the details. This harmony is at once 
a satisfaction to our aesthetic requirements, and an assistance to the 
mind which it supports and guides. At the same time, by setting 
before our eyes a well-ordered whole, it gives us a presentiment of 
a mathematical law. Now, as I have said above, the only mathe- 
matical facts worthy of retaining our attention and capable of being 
useful are those which can make us acquainted with a mathemati- 
cal law. Accordingly we arrive at the following conclusion. The 
useful combinations are precisely the most beautiful, I mean those 
that can most charm that special sensibility that all mathematicians 
know, but of which laymen are so ignorant that they are often 
tempted to smile at it. 

What follows, then? Of the very large number of combinations 
which the subliminal ego blindly forms, almost all are without 
interest and without utility. But, for that very reason, they are with- 
out action on the aesthetic sensibility; the consciousness will never 
know them. A few only are harmonious, and consequently at once 
useful and beautiful, and they will be capable of affecting the geo- 
metrician’s special sensibility I have been speaking of; which, once 
aroused, will direct our attention upon them, and will thus give 
them the opportunity of becoming conscious. 

This is only a hypothesis, and yet there is an observation which 
tends to confirm it. When a sudden illumination invades the math- 
ematician’s mind, it most frequently happens that it does not mis- 
lead him. But it also happens sometimes, as I have said, that it will 
not stand the test of verification. Well, it is to be observed almost 
always that this false idea, if it had been correct. would have flat- 
tered our natural instinct for mathematical elegance. 

Thus it is this special aesthetic sensibility that plays the part of the 
delicate sieve of which I spoke above, and this makes it sufficiently 
clear why the man who has it not will never be a real discoverer. 

All the difficulties, however, have not disappeared. The con- 
scious ego is strictly limited, but as regards the subliminal ego, we 
do not know its limitations, and that is why we are not too loth to 
suppose that in a brief space of time it can form more different 
combinations than could be comprised in the whole life of a con- 
scient being. These limitations do exist, however. Is it conceivable 


that it can form all the possible combinations, M HEM ATIC AL 
° 


whose number staggers the imagination? 


Nevertheless this would seem to be necessary, for if it ys L S C O O 2 As 


produces only a small portion of the combinations, and 
that by chance, there will be very small likelihood of the right one, 
the one that must be selected, being found among them. 

Perhaps we must look for the explanation in that period of pre- 
liminary conscious work which always precedes all fruitful uncon= 
scious work. If I may be permitted a crude comparison, let us rep- 
resent the future elements of our combinations as something resem- 
bling Epicurus’s hooked atoms. When the mind is in complete 
repose these atoms are immovable; they are, so to speak, attached to 
the wall. This complete repose may continue indefinitely without 
the atoms meeting, and, consequently, without the possibility of the 
formation of any combination. 

On the other hand, during a period of apparent repose, but of 
unconscious work, some of them are detached from the wall and set 
in motion. They plough through space in all directions, like a 
swarm of gnats, for instance, or, if we prefer a more learned com- 
parison, like the gaseous molecules in the kinetic theory of gases. 
Their mutual collisions may then produce new combinations. 

What is the part to be played by the preliminary conscious work? 
Clearly it is to liberate some of these atoms, to detach them from the 
wall and set them in motion. We think we haye accomplished noth- 
ing, when we have stirred up the elements in a thousand different 
ways to try to arrange them, and have not succeeded in finding a sat- 
isfactory arrangement. But after this agitation imparted to them by 
our will, they do not return to their original repose, but continue to 
circulate freely. 

Now our will did not select them at random, but in pursuit of a 
perfectly definite aim. Those it has liberated are not, therefore, 
chance atoms; they are those from which we may reasonably expect 
the desired solution. The liberated atoms will then experience col- 
lisions, either with each other, or with the atoms that have remained 
stationary, which they will run against in their course. I apologize 
once more. My comparison is very crude, but I cannot well see how 
I could explain my thought in any other way. 

However it be, the only combinations that have any chance of 
being formed are those in which one at least of the elements is one 
of the atoms deliberately selected by our will. Now it is evidently 
among these that what I called just now the right combination is to 
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be found. Perhaps there is here a means of modifying what was 
paradoxical in the original hypothesis. 

Yet another observation. It never happens that unconscious 
work supplies ready-made the result of a lengthy calculation in 
which we have only to apply fixed rules. It might be supposed that 
the subliminal ego, purely automatic as it is, was peculiarly fitted 
for this kind of work, which is, in a sense, exclusively mechanical. 
It would seem that, by thinking overnight of the factors of a multi- 
plication sum, we might hope to find the product ready-made for us 
on waking; or, again, that an algebraical calculation, for instance, 
or a verification could be made unconsciously. Observation proves 
that such is by no means the case. All that we can hope from these 
inspirations, which are the fruits of unconscious work, is to obtain 
points of departure for such calculations. As for the calculations 
themselves, they must be made in the second period of conscious 
work which follows the inspiration, and in which the results of the 
inspiration are verified and the consequences deduced. The rules of 
these calculations are strict and complicated; they demand disci- 
pline, attention, will, and consequently consciousness. In the sub- 
liminal ego, on the contrary, there reigns what I would call liberty, 
if one could give this name to the mere absence of discipline and to 
disorder born of chance. Only, this very disorder permits of unex- 
pected couplings. 

I will make one last remark. When I related above some per- 
sonal observations, I spoke of a night of excitement, on which I 
worked as though in spite of myself. The cases of this are frequent, 
and it is not necessary that the abnormal cerebral activity should be 
caused by a physical stimulant, as in the case quoted. Well, it 
appears that, in these cases, we are ourselves assisting at our own 
unconscious work, which becomes partly perceptible to the over- 
excited consciousness, but does not on that account change its 
nature. We then become vaguely aware of what distinguishes the 
two mechanisms, or, if you will, of the methods of working of the 
two egos. The psychological observations I have thus succeeded in 
making appear to me, in their general characteristics, to confirm the 
views I have been enunciating. 

Truly there is great need of this, for in spite of everything they 
are and remain largely hypothetical. The interest of the question is 
so great that I do not regret having submitted them to the reader. 
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